| Age | Commit message (Collapse) | Author | Files | Lines |
|
This adds a new target hook to gimple-sel to allow targets to
do target specific massaging of the gimple IL to prepare for
expand.
Tejas will be sending up part 2 of this soon to help convert
SVE packed boolean VEC_COND_EXPR into something that we can
handle more efficiently if expanded in a different order.
We also want to use this to e.g. for Adv. SIMD prefer avoiding
!= vector compare expressions because the ISA doesn't have
this instruction and so we expand to == + ~ but changing the
expression from a MIN to MAX only for VECTOR_BOOLEAN_TYPE_P
and flipping the operands we can expand more efficient.
These are the kind of things we want to use the hook for,
not generic changes that apply to all target.
gcc/ChangeLog:
* target.def (instruction_selection): New.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate
* gimple-isel.cc (pass_gimple_isel::execute): Use it.
* targhooks.cc (default_instruction_selection): New.
* targhooks.h (default_instruction_selection): New.
|
|
This patch adds a new target hook TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA
that allows the backend to chose an address space other than the generic one.
This hook is only invoked when the compiler can make sure that:
- The object for which the hooks is being invoked will be located
in the desired address space, and
- All accesses to that object will be accesses appropriate for
that address space, and
- The object is read-only and is initialized at load time, and
- The hook invokations are independent of each other. This means
that this hook can be used to optimize code / data consumption.
(Rather than introducing an ABI change, which would be the case
when C++'s vtables were put in a different AS).
To date, there are only two candidates for such compiler generated
lookup tables: CSWTCH tables as generated by tree-switch-conversion.cc,
and CRC lookup tables generated by gimple-crc-optimization.cc.
gcc/
* coretypes.h (enum artificial_rodata): New enum type.
* doc/tm.texi: Rebuild.
* doc/tm.texi.in (TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA):
New hook.
* target.def (addr_sapce.for_artificial_rodata): New DEFHOOK.
* targhooks.cc (default_addr_space_convert): New function.
* targhooks.h (default_addr_space_convert): New prototype.
* tree-switch-conversion.cc (build_one_array) <value_type>:
Set type_quals address-space according to
targetm.addr_space.for_artificial_rodata().
* config/avr/avr.cc (avr_rodata_in_flash_p): Move up.
(TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): Define to...
(avr_addr_space_for_artificial_rodata): ...this new function.
* common/config/avr/avr-common.cc (avr_option_optimization_table):
Adjust -ftree-switch-conversion comment.
|
|
This patch removes the type argument from the vector_misalignment hook.
Ever since we switched from element to byte misalignment its
semantics haven't been particularly clear and nowadays it should be
redundant.
Also, in case of gather/scatter, the patch sets misalignment to the
misalignment of one unit of the vector mode so targets can
distinguish between element size alignment and element mode alignment.
is_packed is now always set, regardless of misalignment.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_builtin_support_vector_misalignment):
Remove type.
* config/arm/arm.cc (arm_builtin_support_vector_misalignment):
Ditto.
* config/epiphany/epiphany.cc (epiphany_support_vector_misalignment):
Ditto.
* config/gcn/gcn.cc (gcn_vectorize_support_vector_misalignment):
Ditto.
* config/loongarch/loongarch.cc (loongarch_builtin_support_vector_misalignment):
Ditto.
* config/riscv/riscv.cc (riscv_support_vector_misalignment):
Ditto.
* config/rs6000/rs6000.cc (rs6000_builtin_support_vector_misalignment):
Ditto.
* config/s390/s390.cc (s390_support_vector_misalignment):
Ditto.
* doc/tm.texi: Adjust vector misalignment docs.
* target.def: Ditto.
* targhooks.cc (default_builtin_support_vector_misalignment):
Remove type.
* targhooks.h (default_builtin_support_vector_misalignment):
Ditto.
* tree-vect-data-refs.cc (vect_can_force_dr_alignment_p):
Set misalignment for gather/scatter and remove type.
(vect_supportable_dr_alignment): Ditto.
|
|
Adds an optimisation in FMV to redirect to a specific target if possible.
A call is redirected to a specific target if both:
- the caller can always call the callee version
- and, it is possible to rule out all higher priority versions of the callee
fmv set. That is estabilished either by the callee being the highest priority
version, or each higher priority version of the callee implying that, were it
resolved, a higher priority version of the caller would have been selected.
For this logic, introduces the new TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A
hook. Adds a full implementation for Aarch64, and a weaker default version
for other targets.
This allows the target to replace the previous optimisation as the new one is
able to cover the same case where two function sets implement the same versions.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_functions_b_resolvable_from_a): New
function.
(TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A): New define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add documentation for
TARGET_OPTION_FUNCTIONS_B_RESOLVABLE_FROM_A.
* multiple_target.cc (redirect_to_specific_clone): Add new optimisation
logic.
(ipa_target_clone): Remove check for TARGET_HAS_FMV_TARGET_ATTRIBUTE.
* target.def: Document new hook..
* attribs.cc: (functions_b_resolvable_from_a) New function.
* attribs.h: (functions_b_resolvable_from_a) New function.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/fmv-selection1.C: New test.
* g++.target/aarch64/fmv-selection2.C: New test.
* g++.target/aarch64/fmv-selection3.C: New test.
* g++.target/aarch64/fmv-selection4.C: New test.
* g++.target/aarch64/fmv-selection5.C: New test.
* g++.target/aarch64/fmv-selection6.C: New test.
* g++.target/aarch64/fmv-selection7.C: New test.
|
|
This change refactors FMV handling in the frontend to allows greater
reasoning about versions in shared code.
This is needed for allowing target_clones and target_versions to be used
together in a function set, as there is then two distinct concerns when
encountering two declarations that previously were conflated:
1. Are these two declarations completely disjoint FMV declarations
(ie. the sets of versions they define have no overlap). If so, they don't
conflict so there is no need to merge and both can be pushed.
2. For two declarations that aren't completely disjoint, are they matching
and therefore mergeable. (ie. two target_clone decls that define the same set
of versions, or an un-annotated declaration, and a target_clones definition
containing the default version). If so, continue to the existing merging logic
to try to merge these and diagnose if it's not possible.
If not, then diagnose the conflicting declarations.
To do this the common_function_versions function has been renamed
disjoint_function_versions (meaning, are the version sets defined by these
two decl's completely distinct from each other).
A new hook called same_function_version is introduces taking two
string_slice's (each representing a single version) and determining if they
define the same version.
A new function, called diagnose_versioned_decls is added, which checks
if two decls (with overlapping version sets) can be merged and diagnose when
they cannot be (only in terms of the attributes, the existing logic is used to
detect other mergeability conflicts like redefinition).
This only effects targets with TARGET_HAS_FMV_TARGET_ATTRIBUTE set to false.
(ie. aarch64 and riscv), the existing logic for i86 and ppc is unchanged.
This also means the same function version hook is only used for aarch64 and
riscv.
gcc/ChangeLog:
* attribs.h (common_function_versions): Removed.
* attribs.cc (common_function_versions): Removed.
* config/aarch64/aarch64.cc (aarch64_common_function_versions): Removed.
(aarch64_same_function_versions): New function to check if two version
strings imply the same version.
(TARGET_OPTION_FUNCTION_VERSIONS): Removed.
(TARGET_OPTION_SAME_FUNCTION_VERSIONS): New macro.
* config/i386/i386.cc (TARGET_OPTION_FUNCTION_VERSIONS): Removed.
* config/rs6000/rs6000.cc (TARGET_OPTION_FUNCTION_VERSIONS): Removed.
* config/riscv/riscv.cc (riscv_same_function_versions): New function
to check if two version strings imply the same version.
(riscv_common_function_versions): Removed.
(TARGET_OPTION_FUNCTION_VERSIONS): Removed.
(TARGET_OPTION_SAME_FUNCTION_VERSIONS): New macro.
* doc/tm.texi: Regenerated.
* target.def: Remove common_version hook and add same_function_version
hook.
* doc/tm.texi.in: Ditto.
* tree.cc (distinct_version_decls): New function.
(mergeable_version_decls): Ditto.
* tree.h (distinct_version_decls): New function.
(mergeable_version_decls): Ditto.
* hooks.h (hook_stringslice_stringslice_unreachable): New function.
* hooks.cc (hook_stringslice_stringslice_unreachable): New function.
gcc/cp/ChangeLog:
* class.cc (resolve_address_of_overloaded_function): Updated to use
dijoint_versions_decls instead of common_function_version hook.
* decl.cc (decls_match): Refacture to use disjoint_version_decls and
to pass through conflicting_version argument.
(maybe_version_functions): Updated to use
disjoint_version_decls instead of common_function_version hook.
(duplicate_decls): Add logic to handle conflicting unmergable decls
and improve diagnostics for conflicting versions.
* decl2.cc (check_classfn): Updated to use
disjoint_version_decls instead of common_function_version hook.
|
|
This patch introduces the TARGET_CHECK_TARGET_CLONE_VERSION hook
which is used to determine if a target_clones version string parses.
The hook has a flag to enable emitting diagnostics.
This is as specified in the Arm C Language Extension. The purpose of this
is to be able to ignore invalid versions to allow some portability of code
using target_clones attributes.
Currently this is only properly implemented for the Aarch64 backend.
For riscv which is the only other backend which uses target_version
semantics a partial implementation is present, where this hook is used
to check parsing, in which errors will be emitted on a failed parse
rather than warnings. A refactor of the riscv parsing logic would be
required to enable this functionality fully.
This fixes PR 118339 where parse failures could cause ICE in Aarch64.
gcc/ChangeLog:
PR target/118339
* target.def: Add check_target_clone_version hook.
* tree.cc (get_clone_attr_versions): Add filter argument.
(get_clone_versions): Add filter argument.
* tree.h (get_clone_attr_versions): Add filter.
(get_clone_versions): Add filter argument.
* config/aarch64/aarch64.cc (aarch64_check_target_clone_version):
New function
(TARGET_CHECK_TARGET_CLONE_VERSION): New define.
* config/riscv/riscv.cc (riscv_check_target_clone_version):
New function.
(TARGET_CHECK_TARGET_CLONE_VERSION): New define.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in: Add documentation for new hook.
* hooks.h (hook_stringslice_locationtptr_true): New function.
* hooks.cc (hook_stringslice_locationtptr_true): New function.
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_target_clones_attribute): Update to use new hook.
|
|
gcc:
* target.def (dtors_from_cxa_atexit): Properly mark up
__cxa_atexit as code.
* doc/tm.texi: Regenerate.
|
|
gcc/Changelog:
* asan.h (HWASAN_TAG_SIZE): Use targetm.memtag.tag_bitsize.
* config/i386/i386.cc (ix86_memtag_tag_size): Rename to
ix86_memtag_tag_bitsize.
(TARGET_MEMTAG_TAG_SIZE): Renamed to TARGET_MEMTAG_TAG_BITSIZE.
* doc/tm.texi (TARGET_MEMTAG_TAG_SIZE): Likewise.
* doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE): Likewise.
* target.def (tag_size): Rename to tag_bitsize.
* targhooks.cc (default_memtag_tag_size): Rename to
default_memtag_tag_bitsize.
* targhooks.h (default_memtag_tag_size): Likewise.
Signed-off-by: Claudiu Zissulescu <claudiu.zissulescu-ianculescu@oracle.com>
Co-authored-by: Claudiu Zissulescu <claudiu.zissulescu-ianculescu@oracle.com>
|
|
For AMD GCN, the instructions available for loading/storing vectors are
always scatter/gather operations (i.e. there are separate addresses for
each vector lane), so the current heuristic to avoid gather/scatter
operations with too many elements in get_group_load_store_type is
counterproductive. Avoiding such operations in that function can
subsequently lead to a missed vectorization opportunity whereby later
analyses in the vectorizer try to use a very wide array type which is
not available on this target, and thus it bails out.
This patch adds a target hook to override the "single_element_p"
heuristic in the function as a target hook, and activates it for GCN. This
allows much better code to be generated for affected loops.
Co-authored-by: Julian Brown <julian@codesourcery.com>
gcc/
* doc/tm.texi.in (TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Add
documentation hook.
* doc/tm.texi: Regenerate.
* target.def (prefer_gather_scatter): Add target hook under vectorizer.
* hooks.cc (hook_bool_mode_int_unsigned_false): New function.
* hooks.h (hook_bool_mode_int_unsigned_false): New prototype.
* tree-vect-stmts.cc (vect_use_strided_gather_scatters_p): Add
parameters group_size and single_element_p, and rework to use
targetm.vectorize.prefer_gather_scatter.
(get_group_load_store_type): Move some of the condition into
vect_use_strided_gather_scatters_p.
* config/gcn/gcn.cc (gcn_prefer_gather_scatter): New function.
(TARGET_VECTORIZE_PREFER_GATHER_SCATTER): Define hook.
|
|
This patch adds an is_gather_scatter argument to the
support_vector_misalignment hook. All targets but riscv do not care
about alignment for gather/scatter so return true for is_gather_scatter.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (aarch64_builtin_support_vector_misalignment):
Return true for gather/scatter.
* config/arm/arm.cc (arm_builtin_support_vector_misalignment):
Ditto.
* config/epiphany/epiphany.cc (epiphany_support_vector_misalignment):
Ditto.
* config/gcn/gcn.cc (gcn_vectorize_support_vector_misalignment):
Ditto.
* config/loongarch/loongarch.cc (loongarch_builtin_support_vector_misalignment):
Ditto.
* config/riscv/riscv.cc (riscv_support_vector_misalignment):
Add gather/scatter argument.
* config/rs6000/rs6000.cc (rs6000_builtin_support_vector_misalignment):
Return true for gather/scatter.
* config/s390/s390.cc (s390_support_vector_misalignment):
Ditto.
* doc/tm.texi: Add argument.
* target.def: Ditto.
* targhooks.cc (default_builtin_support_vector_misalignment):
Ditto.
* targhooks.h (default_builtin_support_vector_misalignment):
Ditto.
* tree-vect-data-refs.cc (vect_supportable_dr_alignment):
Ditto.
|
|
Since TARGET_PROMOTE_FUNCTION_RETURN is no longer used, remove its
reference from target.def.
PR target/119985
* target.def: Remove TARGET_PROMOTE_FUNCTION_RETURN reference.
* doc/tm.texi: Regenerated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
Following on from the discussion in:
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html
this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and
replaces it with two hooks: one that controls the cost of using an
extra callee-saved register and one that controls the cost of allocating
a frame for the first spill.
(The patch does not attempt to address the shrink-wrapping part of
the thread above.)
On AArch64, this is enough to fix PR117477, as verified by the new tests.
The patch does not change the SPEC2017 scores significantly. (I saw a
slight improvement in fotonik3d and roms, but I'm not convinced that
the improvements are real.)
The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c,
which is a scan-dump correctness test that relies on not using
caller saves. The decision to use caller saves looks appropriate,
and saves an instruction, so I've just added -fno-caller-saves
to the test options.
The x86 parts were written by Honza. ix86_callee_save_cost is updated
by H.J. to replace gcc_checking_assert with returning 1 if mem_cost <= 2.
gcc/
PR rtl-optimization/117477
* config/aarch64/aarch64.cc (aarch64_count_saves): New function.
(aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost)
(aarch64_frame_allocation_cost): Likewise.
(TARGET_CALLEE_SAVE_COST): Define.
(TARGET_FRAME_ALLOCATION_COST): Likewise.
* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
Replace with...
(ix86_callee_save_cost): ...this new hook.
(TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST): Define.
* target.h (spill_cost_type, frame_cost_type): New enums.
* target.def (callee_save_cost, frame_allocation_cost): New hooks.
(ira_callee_saved_register_cost_scale): Delete.
* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks.
* doc/tm.texi: Regenerate.
* hard-reg-set.h (hard_reg_set_popcount): New function.
* ira-color.cc (allocated_memory_p): New variable.
(allocated_callee_save_regs): Likewise.
(record_allocation): New function.
(assign_hard_reg): Use targetm.frame_allocation_cost to model
the cost of the first spill or first caller save. Use
targetm.callee_save_cost to model the cost of using new callee-saved
registers. Apply the exit rather than entry frequency to the cost
of restoring a register or deallocating the frame. Update the
new variables above.
(improve_allocation): Use record_allocation.
(color): Initialize allocated_callee_save_regs.
(ira_color): Initialize allocated_memory_p.
* targhooks.h (default_callee_save_cost): Declare.
(default_frame_allocation_cost): Likewise.
* targhooks.cc (default_callee_save_cost): New function.
(default_frame_allocation_cost): Likewise.
gcc/testsuite/
PR rtl-optimization/117477
* gcc.target/aarch64/callee_save_1.c: New test.
* gcc.target/aarch64/callee_save_2.c: Likewise.
* gcc.target/aarch64/callee_save_3.c: Likewise.
* gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves.
Co-authored-by: Jan Hubicka <hubicka@ucw.cz>
Co-authored-by: H.J. Lu <hjl.tools@gmail.com>
|
|
This reverts commit e836d80374aa03a5ea5bd6cca00d826020c461da.
|
|
Following on from the discussion in:
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675256.html
this patch removes TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE and
replaces it with two hooks: one that controls the cost of using an
extra callee-saved register and one that controls the cost of allocating
a frame for the first spill.
(The patch does not attempt to address the shrink-wrapping part of
the thread above.)
On AArch64, this is enough to fix PR117477, as verified by the new tests.
The patch does not change the SPEC2017 scores significantly. (I saw a
slight improvement in fotonik3d and roms, but I'm not convinced that
the improvements are real.)
The patch makes IRA use caller saves for gcc.target/aarch64/pr103350-1.c,
which is a scan-dump correctness test that relies on not using
caller saves. The decision to use caller saves looks appropriate,
and saves an instruction, so I've just added -fno-caller-saves
to the test options.
The x86 parts were written by Honza.
gcc/
PR rtl-optimization/117477
* config/aarch64/aarch64.cc (aarch64_count_saves): New function.
(aarch64_count_above_hard_fp_saves, aarch64_callee_save_cost)
(aarch64_frame_allocation_cost): Likewise.
(TARGET_CALLEE_SAVE_COST): Define.
(TARGET_FRAME_ALLOCATION_COST): Likewise.
* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
Replace with...
(ix86_callee_save_cost): ...this new hook.
(TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST): Define.
* target.h (spill_cost_type, frame_cost_type): New enums.
* target.def (callee_save_cost, frame_allocation_cost): New hooks.
(ira_callee_saved_register_cost_scale): Delete.
* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Delete.
(TARGET_CALLEE_SAVE_COST, TARGET_FRAME_ALLOCATION_COST): New hooks.
* doc/tm.texi: Regenerate.
* hard-reg-set.h (hard_reg_set_popcount): New function.
* ira-color.cc (allocated_memory_p): New variable.
(allocated_callee_save_regs): Likewise.
(record_allocation): New function.
(assign_hard_reg): Use targetm.frame_allocation_cost to model
the cost of the first spill or first caller save. Use
targetm.callee_save_cost to model the cost of using new callee-saved
registers. Apply the exit rather than entry frequency to the cost
of restoring a register or deallocating the frame. Update the
new variables above.
(improve_allocation): Use record_allocation.
(color): Initialize allocated_callee_save_regs.
(ira_color): Initialize allocated_memory_p.
* targhooks.h (default_callee_save_cost): Declare.
(default_frame_allocation_cost): Likewise.
* targhooks.cc (default_callee_save_cost): New function.
(default_frame_allocation_cost): Likewise.
gcc/testsuite/
PR rtl-optimization/117477
* gcc.target/aarch64/callee_save_1.c: New test.
* gcc.target/aarch64/callee_save_2.c: Likewise.
* gcc.target/aarch64/callee_save_3.c: Likewise.
* gcc.target/aarch64/pr103350-1.c: Add -fno-caller-saves.
Co-authored-by: Jan Hubicka <hubicka@ucw.cz>
|
|
commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala <jskumari@linux.ibm.com>
Date: Tue Jun 25 08:37:49 2024 -0500
ira: Scale save/restore costs of callee save registers with block frequency
scales the cost of saving/restoring a callee-save hard register in epilogue
and prologue with the entry block frequency, which, if not optimizing for
size, is 10000, for all targets. As the result, callee-saved registers
may not be used to preserve local variable values across calls on some
targets, like x86. Add a target hook for the callee-saved register cost
scale in epilogue and prologue used by IRA. The default version of this
target hook returns 1 if optimizing for size, otherwise returns the entry
block frequency. Add an x86 version of this target hook to restore the
old behavior prior to the above commit.
PR rtl-optimization/111673
PR rtl-optimization/115932
PR rtl-optimization/116028
PR rtl-optimization/117081
PR rtl-optimization/117082
PR rtl-optimization/118497
* ira-color.cc (assign_hard_reg): Call the target hook for the
callee-saved register cost scale in epilogue and prologue.
* target.def (ira_callee_saved_register_cost_scale): New target
hook.
* targhooks.cc (default_ira_callee_saved_register_cost_scale):
New.
* targhooks.h (default_ira_callee_saved_register_cost_scale):
Likewise.
* config/i386/i386.cc (ix86_ira_callee_saved_register_cost_scale):
New.
(TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE): Likewise.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE):
New.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
|
|
This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
might match. This will cause type errors later on.
Other targets do not currently need to use this argument.
gcc/ChangeLog:
PR target/96342
* target.def (TARGET_SIMD_CLONE_USABLE): Add argument.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass stmt_info to
call TARGET_SIMD_CLONE_USABLE.
* config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add argument
and use it to reject the use of SVE simd clones with Advanced SIMD
modes.
* config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused argument.
* config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
* doc/tm.texi: Regenerate
Co-authored-by: Victor Do Nascimento <victor.donascimento@arm.com>
Co-authored-by: Tamar Christina <tamar.christina@arm.com>
|
|
This commit newly introduces the ability to use overloaded builtins in
C++ SFINAE context.
The goal behind this is in order to ensure there is a single mechanism
that libstdc++ can use to determine whether a given type can be used in
the atomic fetch_add (and similar) builtins. I am working on another
patch that hopes to use this mechanism to identify whether fetch_add
(and similar) work on floating point types.
Current state of the world:
GCC currently exposes resolved versions of these builtins to the
user, so for GCC it's currently possible to use tests similar to the
below to check for atomic loads on a 2 byte sized object.
#if __has_builtin(__atomic_load_2)
Clang does not expose resolved versions of the atomic builtins.
clang currently allows SFINAE on builtins, so that C++ code can
check whether a builtin is available on a given type.
GCC does not (and that is what this patch aims to change).
C libraries like libatomic can check whether a given atomic builtin
can work on a given type by using autoconf to check for a
miscompilation when attempting such a use.
My goal:
I would like to enable floating point fetch_add (and similar) in
GCC, in order to use those overloads in libstdc++ implementation of
atomic<float>::fetch_add.
This should allow compilers targeting GPU's which have floating
point fetch_add instructions to emit optimal code.
In order to do that I need some consistent mechanism that libstdc++
can use to identify whether the fetch_add builtins have floating
point overloads (and for which types these exist).
I would hence like to enable SFINAE on builtins, so that libstdc++
can use that mechanism for the floating point fetch_add builtins.
Implementation follows the existing mechanism for handling SFINAE
contexts in c-common.cc. A boolean is passed into the c-common.cc
function indicating whether these functions should emit errors or not.
This boolean comes from `complain & tf_error` in the C++ frontend.
(Similar to other functions like valid_array_size_p and
c_build_vec_perm_expr).
This is done both for resolve_overloaded_builtin and
check_builtin_function_arguments, both of which can be used in SFINAE
contexts.
I attempted to trigger something using the `reject_gcc_builtin`
function in an SFINAE context. Given the context where this
function is called from the C++ frontend it looks like it may be
possible, but I did not manage to trigger this in template context
by attempting to do something similar to the testcases added around
those calls.
- I would appreciate any feedback on whether this is something that
can happen in a template context, and if so some help writing a
relevant testcase for it.
Both of these functions have target hooks for target specific builtins
that I have updated to take the extra boolean flag. I have not adjusted
the functions implementing those target hooks (except to update the
declarations) so target specific builtins will still error in SFINAE
contexts.
- I could imagine not updating the target hook definition since nothing
would use that change. However I figure that allowing targets to
decide this behaviour would be the right thing to do eventually, and
since this is the target-independent part of the change to do that
this patch should make that change.
Could adjust if others disagree.
Other relevant points that I'd appreciate reviewers check:
- I did not pass this new flag through
atomic_bitint_fetch_using_cas_loop since the _BitInt type is not
available in the C++ frontend and I didn't want if conditions that can
not be executed in the source.
- I only test non-compile-time-constant types with SVE types, since I do
not know of a way to get a VLA into a SFINAE context.
- While writing tests I noticed a few differences with clang in this
area. I don't think they are problematic but am mentioning them for
completeness and to allow others to judge if these are a problem).
- atomic_fetch_add on a boolean is allowed by clang.
- When __atomic_load is passed an invalid memory model (i.e. too
large), we give an SFINAE failure while clang does not.
Bootstrap and regression tested on AArch64 and x86_64.
Built first stage on targets whose target hook declaration needed
updated (though did not regtest etc). Targets triplets I built in order
to check the backend specific changes I made:
- arm-none-linux-gnueabihf
- avr-linux-gnu
- riscv-linux-gnu
- powerpc-linux-gnu
- s390x-linux-gnu
Ok for commit to trunk?
gcc/c-family/ChangeLog:
* c-common.cc (builtin_function_validate_nargs,
check_builtin_function_arguments,
speculation_safe_value_resolve_call,
speculation_safe_value_resolve_params, sync_resolve_size,
sync_resolve_params, get_atomic_generic_size,
resolve_overloaded_atomic_exchange,
resolve_overloaded_atomic_compare_exchange,
resolve_overloaded_atomic_load, resolve_overloaded_atomic_store,
resolve_overloaded_builtin): Add `complain` boolean parameter
and determine whether to emit errors based on its value.
* c-common.h (check_builtin_function_arguments,
resolve_overloaded_builtin): Mention `complain` boolean
parameter in declarations. Give it a default of `true`.
gcc/ChangeLog:
* config/aarch64/aarch64-c.cc
(aarch64_resolve_overloaded_builtin,aarch64_check_builtin_call):
Add new unused boolean parameter to match target hook
definition.
* config/arm/arm-builtins.cc (arm_check_builtin_call): Likewise.
* config/arm/arm-c.cc (arm_resolve_overloaded_builtin):
Likewise.
* config/arm/arm-protos.h (arm_check_builtin_call): Likewise.
* config/avr/avr-c.cc (avr_resolve_overloaded_builtin):
Likewise.
* config/riscv/riscv-c.cc (riscv_check_builtin_call,
riscv_resolve_overloaded_builtin): Likewise.
* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
Likewise.
* config/rs6000/rs6000-protos.h (altivec_resolve_overloaded_builtin):
Likewise.
* config/s390/s390-c.cc (s390_resolve_overloaded_builtin):
Likewise.
* doc/tm.texi: Regenerate.
* target.def (TARGET_RESOLVE_OVERLOADED_BUILTIN,
TARGET_CHECK_BUILTIN_CALL): Update prototype to include a
boolean parameter that indicates whether errors should be
emitted. Update documentation to mention this fact.
gcc/cp/ChangeLog:
* call.cc (build_cxx_call): Pass `complain` parameter to
check_builtin_function_arguments. Take its value from the
`tsubst_flags_t` type `complain & tf_error`.
* semantics.cc (finish_call_expr): Pass `complain` parameter to
resolve_overloaded_builtin. Take its value from the
`tsubst_flags_t` type `complain & tf_error`.
gcc/testsuite/ChangeLog:
* g++.dg/template/builtin-atomic-overloads.def: New test.
* g++.dg/template/builtin-atomic-overloads1.C: New test.
* g++.dg/template/builtin-atomic-overloads2.C: New test.
* g++.dg/template/builtin-atomic-overloads3.C: New test.
* g++.dg/template/builtin-atomic-overloads4.C: New test.
* g++.dg/template/builtin-atomic-overloads5.C: New test.
* g++.dg/template/builtin-atomic-overloads6.C: New test.
* g++.dg/template/builtin-atomic-overloads7.C: New test.
* g++.dg/template/builtin-atomic-overloads8.C: New test.
* g++.dg/template/builtin-sfinae-check-function-arguments.C: New test.
* g++.dg/template/builtin-speculation-overloads.def: New test.
* g++.dg/template/builtin-speculation-overloads1.C: New test.
* g++.dg/template/builtin-speculation-overloads2.C: New test.
* g++.dg/template/builtin-speculation-overloads3.C: New test.
* g++.dg/template/builtin-speculation-overloads4.C: New test.
* g++.dg/template/builtin-speculation-overloads5.C: New test.
* g++.dg/template/builtin-validate-nargs.C: New test.
Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
|
|
This is v2 of the patch; v1 was here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655541.html
Changed in v2:
* added a new TARGET_DOCUMENTATION_NAME hook for figuring out which
documentation URL to use when there are multiple per-target docs,
such as for __attribute__((interrupt)); implemented this for all
targets that have target-specific attributes
* moved attribute_urlifier and its support code to a new
gcc-attribute-urlifier.cc since it needs to use targetm for the
above; gcc-urlifier.o is used by the driver.
* fixed extend.texi so that some attributes that failed to appear in
attr-urls.def now do so (affected nvptx "kernel" and "shared" attrs)
* regenerated attr-urls.def for the above fix, and bringing in
attributes added since v1 of the patch
In r14-5118-gc5db4d8ba5f3de I added a mechanism to automatically add
documentation URLs to quoted strings in diagnostics.
In r14-6920-g9e49746da303b8 I added a mechanism to generate URLs for
mentions of command-line options in quoted strings in diagnostics.
This patch does a similar thing for attributes. It adds a new Python 3
script to scrape the generated HTML looking for documentation of
attributes, and uses this to (re)generate a new gcc/attr-urls.def file.
Running "make regenerate-attr-urls" after rebuilding the HTML docs will
regenerate gcc/attr-urls.def in the source directory.
The patch uses this to optionally add doc URLs for attributes in any
diagnostic emitted during the lifetime of a auto_urlify_attributes
instance, and adds such instances everywhere that a diagnostic refers
to a diagnostic within quotes (based on grepping the source tree
for references to attributes in strings and in code).
For example, given:
$ ./xgcc -B. -S ../../src/gcc/testsuite/gcc.dg/attr-access-2.c
../../src/gcc/testsuite/gcc.dg/attr-access-2.c:14:16: warning:
attribute ‘access(read_write, 2, 3)’ positional argument 2 conflicts
with previous designation by argument 1 [-Wattributes]
with this patch the quoted text `access(read_write, 2, 3)'
automatically gains the URL for our docs for "access":
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-access-function-attribute
in a sufficiently modern terminal.
Like r14-6920-g9e49746da303b8 this avoids the Makefile target
depending on the generated HTML, since a missing URL is a minor
problem, whereas requiring all users to build HTML docs seems more
involved. Doing so also avoids Python 3 as a build requirement for
everyone, but instead just for developers addding attributes.
Like the options, we could add a CI test for this.
The patch gathers both general and target-specific attributes.
For example, the function attribute "interrupt" has 19 URLs within our
docs: one common, and 18 target-specific ones.
The patch adds a new target hook used when selecting the most
appropriate one.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/ChangeLog:
* Makefile.in (OBJS): Add -attribute-urlifier.o.
(ATTR_URLS_HTML_DEPS): New.
(regenerate-attr-urls): New.
(regenerate-attr-urls-unit-test): New.
* attr-urls.def: New file.
* attribs.cc: Include "gcc-urlifier.h".
(decl_attributes): Use auto_urlify_attributes.
* config/aarch64/aarch64.cc (TARGET_DOCUMENTATION_NAME): New.
* config/arc/arc.cc (TARGET_DOCUMENTATION_NAME): New.
* config/arm/arm.cc (TARGET_DOCUMENTATION_NAME): New.
* config/bfin/bfin.cc (TARGET_DOCUMENTATION_NAME): New.
* config/bpf/bpf.cc (TARGET_DOCUMENTATION_NAME): New.
* config/epiphany/epiphany.cc (TARGET_DOCUMENTATION_NAME): New.
* config/gcn/gcn.cc (TARGET_DOCUMENTATION_NAME): New.
* config/h8300/h8300.cc (TARGET_DOCUMENTATION_NAME): New.
* config/i386/i386.cc (TARGET_DOCUMENTATION_NAME): New.
* config/ia64/ia64.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m32c/m32c.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m32r/m32r.cc (TARGET_DOCUMENTATION_NAME): New.
* config/m68k/m68k.cc (TARGET_DOCUMENTATION_NAME): New.
* config/mcore/mcore.cc (TARGET_DOCUMENTATION_NAME): New.
* config/microblaze/microblaze.cc (TARGET_DOCUMENTATION_NAME):
New.
* config/mips/mips.cc (TARGET_DOCUMENTATION_NAME): New.
* config/msp430/msp430.cc (TARGET_DOCUMENTATION_NAME): New.
* config/nds32/nds32.cc (TARGET_DOCUMENTATION_NAME): New.
* config/nvptx/nvptx.cc (TARGET_DOCUMENTATION_NAME): New.
* config/riscv/riscv.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rl78/rl78.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rs6000/rs6000.cc (TARGET_DOCUMENTATION_NAME): New.
* config/rx/rx.cc (TARGET_DOCUMENTATION_NAME): New.
* config/s390/s390.cc (TARGET_DOCUMENTATION_NAME): New.
* config/sh/sh.cc (TARGET_DOCUMENTATION_NAME): New.
* config/stormy16/stormy16.cc (TARGET_DOCUMENTATION_NAME): New.
* config/v850/v850.cc (TARGET_DOCUMENTATION_NAME): New.
* config/visium/visium.cc (TARGET_DOCUMENTATION_NAME): New.
gcc/analyzer/ChangeLog:
* region-model.cc: Include "gcc-urlifier.h".
(reason_attr_access::emit): Use auto_urlify_attributes.
* sm-taint.cc: Include "gcc-urlifier.h".
(tainted_access_attrib_size::emit): Use auto_urlify_attributes.
gcc/c-family/ChangeLog:
* c-attribs.cc: Include "gcc-urlifier.h".
(positional_argument): Use auto_urlify_attributes.
* c-common.cc: Include "gcc-urlifier.h".
(parse_optimize_options): Use auto_urlify_attributes with
OPT_Wattributes.
(attribute_fallthrough_p): Use auto_urlify_attributes.
* c-warn.cc: Include "gcc-urlifier.h".
(diagnose_mismatched_attributes): Use auto_urlify_attributes.
gcc/c/ChangeLog:
* c-decl.cc: Include "gcc-urlifier.h".
(start_decl): Use auto_urlify_attributes with OPT_Wattributes.
(start_function): Likewise.
* c-parser.cc: Include "gcc-urlifier.h".
(c_parser_statement_after_labels): Use auto_urlify_attributes with
OPT_Wattributes.
* c-typeck.cc: Include "gcc-urlifier.h".
(maybe_warn_nodiscard): Use auto_urlify_attributes with
OPT_Wunused_result.
gcc/cp/ChangeLog:
* cp-gimplify.cc: Include "gcc-urlifier.h".
(process_stmt_hotness_attribute): Use auto_urlify_attributes with
OPT_Wattributes.
* cvt.cc: Include "gcc-urlifier.h".
(maybe_warn_nodiscard): Use auto_urlify_attributes with
OPT_Wunused_result.
* decl.cc: Include "gcc-urlifier.h".
(start_decl): Use auto_urlify_attributes.
(start_preparsed_function): Likewise.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::override_urlifier): New.
* diagnostic.h (diagnostic_context::override_urlifier): New decl.
* doc/extend.texi (Nvidia PTX Function Attributes): Update
@cindex to specify that "kernel" is a function attribute and
"shared" is a variable attribute, so that these entries are
recognized by the regex in regenerate-attr-urls.py.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_DOCUMENTATION_NAME): New.
* gcc-attribute-urlifier.cc: New file.
* gcc-urlifier.cc: Include diagnostic.h.
(gcc_urlifier::make_doc): Convert to...
(make_doc_url): ...this.
(auto_override_urlifier::auto_override_urlifier): New.
(auto_override_urlifier::~auto_override_urlifier): New.
(selftest::gcc_urlifier_cc_tests): Split out body into...
(selftest::test_gcc_urlifier): ...this.
* gcc-urlifier.h: Include "pretty-print-urlifier.h" and "label-text.h".
(make_doc_url): New decl.
(class auto_override_urlifier): New.
(class attribute_urlifier): New.
(class auto_urlify_attributes): New.
* gimple-ssa-warn-access.cc: Include "gcc-urlifier.h".
(pass_waccess::execute): Use auto_urlify_attributes.
* gimplify.cc: Include "gcc-urlifier.h".
(expand_FALLTHROUGH): Use auto_urlify_attributes.
* internal-fn.cc: Define INCLUDE_MEMORY and include
"gcc-urlifier.h.
(expand_FALLTHROUGH): Use auto_urlify_attributes.
* ipa-pure-const.cc: Include "gcc-urlifier.h.
(suggest_attribute): Use auto_urlify_attributes.
* ipa-strub.cc: Include "gcc-urlifier.h.
(can_strub_p): Use auto_urlify_attributes.
* regenerate-attr-urls.py: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
gcc_attribute_urlifier_cc_tests.
* selftest.h (selftest::gcc_attribute_urlifier_cc_tests): New
decl.
* target.def (documentation_name): New DEFHOOKPOD.
* tree-cfg.cc: Include "gcc-urlifier.h.
(do_warn_unused_result): Use auto_urlify_attributes.
* tree-ssa-uninit.cc: Include "gcc-urlifier.h.
(maybe_warn_read_write_only): Use auto_urlify_attributes.
(maybe_warn_pass_by_reference): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
A few targets have been using "unsigned int" function arguments that need to
receive a "location_t". Change to "location_t" to prepare for the
possibility that location_t can be configured to be a different type.
gcc/ChangeLog:
* config/aarch64/aarch64-c.cc (aarch64_resolve_overloaded_builtin):
Change "unsigned int" argument to "location_t".
* config/avr/avr-c.cc (avr_resolve_overloaded_builtin): Likewise.
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Likewise.
* target.def: Likewise.
* doc/tm.texi: Regenerate.
|
|
The following patch adds a "redzone" clobber (recognized everywhere,
even on on targets which don't do anything with it),
with which one can mark the rare case where inline asm pushes
something on the stack or uses call instruction without taking
red zone into account (i.e. addq $-128, %rsp; and addq $128, %rsp
around that).
2024-11-28 Jakub Jelinek <jakub@redhat.com>
gcc/
* target.def (redzone_clobber): New target hook.
* varasm.cc (decode_reg_name_and_count): Return -5 for
"redzone".
* cfgexpand.cc (expand_asm_stmt): Handle redzone clobber.
* config/i386/i386.h (struct machine_function): Add
asm_redzone_clobber_seen member.
* config/i386/i386.cc (ix86_compute_frame_layout): Don't
use red zone if cfun->machine->asm_redzone_clobber_seen.
(ix86_redzone_clobber): New function.
(TARGET_REDZONE_CLOBBER): Redefine.
* doc/extend.texi (Clobbers and Scratch Registers): Document
the "redzone" clobber.
* doc/tm.texi.in: Add @hook TARGET_REDZONE_CLOBBER.
* doc/tm.texi: Regenerate.
gcc/testsuite/
* gcc.dg/asm-redzone-1.c: New test.
* gcc.target/i386/asm-redzone-1.c: New test.
|
|
AddressSanitizer has supported dynamic shadow offsets since 2016[1], but
GCC hasn't implemented this yet because targets using dynamic shadow
offsets, such as Fuchsia and iOS, are mostly unsupported in GCC.
However, RISC-V 64 switched to dynamic shadow offsets this year[2] because
virtual memory space support varies across different RISC-V cores, such as
Sv39, Sv48, and Sv57. We realized that the best way to handle this
situation is by using a dynamic shadow offset to obtain the offset at
runtime.
We introduce a new target hook, TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P, to
determine if the target is using a dynamic shadow offset, so this change
won't affect the static offset path. Additionally, TARGET_ASAN_SHADOW_OFFSET
continues to work even if TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P is non-zero,
ensuring that KASAN functions as expected.
This patch set has been verified on the Banana Pi F3, currently one of the
most popular RISC-V development boards. All AddressSanitizer-related tests
passed without introducing new regressions.
It was also verified on AArch64 and x86_64 with no regressions in
AddressSanitizer.
[1] https://github.com/llvm/llvm-project/commit/130a190bf08a3d955d9db24dac936159dc049e12
[2] https://github.com/llvm/llvm-project/commit/da0c8b275564f814a53a5c19497669ae2d99538d
gcc/ChangeLog:
* asan.cc (asan_dynamic_shadow_offset_p): New.
(asan_shadow_memory_dynamic_address): New.
(asan_local_shadow_memory_dynamic_address): New.
(get_asan_shadow_memory_dynamic_address_decl): New.
(asan_maybe_insert_dynamic_shadow_at_function_entry): New.
(asan_emit_stack_protection): Support dynamic shadow offset.
(build_shadow_mem_access): Ditto.
* asan.h (asan_maybe_insert_dynamic_shadow_at_function_entry): New.
* doc/tm.texi (TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P): New.
* doc/tm.texi.in (TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P): Ditto.
* sanopt.cc (pass_sanopt::execute): Handle dynamic shadow offset.
* target.def (asan_dynamic_shadow_offset_p): New.
* toplev.cc (process_options): Handle dynamic shadow offset.
|
|
This pass detects cases of expensive store forwarding and tries to
avoid them by reordering the stores and using suitable bit insertion
sequences. For example it can transform this:
strb w2, [x1, 1]
ldr x0, [x1] # Expensive store forwarding to larger load.
To:
ldr x0, [x1]
strb w2, [x1]
bfi x0, x2, 0, 8
Assembly like this can appear with bitfields or type punning / unions.
On stress-ng when running the cpu-union microbenchmark the following
speedups have been observed.
Neoverse-N1: +29.4%
Intel Coffeelake: +13.1%
AMD 5950X: +17.5%
The transformation is rejected on cases that cause store_bit_field to
generate subreg expressions on different register classes. Files
avoid-store-forwarding-4.c and avoid-store-forwarding-5.c contain such
cases and have been marked as XFAIL.
Due to biasing of its operands in store_bit_field, there is a special
handling for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN. The
need for this was exosed by an issue exposed on the H8 architecture,
which uses big-endian ordering, but BITS_BIG_ENDIAN is false. In that
case, the START parameter of store_bit_field needs to be calculated
from the end of the destination register.
gcc/ChangeLog:
* Makefile.in (OBJS): Add avoid-store-forwarding.o.
* common.opt (favoid-store-forwarding): New option.
* common.opt.urls: Regenerate.
* doc/invoke.texi: New param store-forwarding-max-distance.
* doc/passes.texi: Document new pass.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document new pass.
* params.opt (store-forwarding-max-distance): New param.
* passes.def: Add pass_rtl_avoid_store_forwarding before
pass_early_remat.
* target.def (avoid_store_forwarding_p): New DEFHOOK.
* target.h (struct store_fwd_info): Declare.
* targhooks.cc (default_avoid_store_forwarding_p): New function.
* targhooks.h (default_avoid_store_forwarding_p): Declare.
* tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
* avoid-store-forwarding.cc: New file.
* avoid-store-forwarding.h: New file.
* timevar.def (TV_AVOID_STORE_FORWARDING): New timevar.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c: New test.
Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Konstantinos Eleftheriou <konstantinos.eleftheriou@vrull.eu>
|
|
I've tried to build stage3 with
-Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank
added to STRICT_WARN and that expectably resulted in about
2744 unique trailing whitespace warnings and 124837 leading whitespace
warnings when excluding *.md files (which obviously is in big part a
generator issue). Others from that are generator related, I think those
need to be solved later.
The following patch just fixes up the easy case (trailing whitespace),
which could be easily automated:
for i in `find . -name \*.h -o -name \*.cc -o -name \*.c | xargs grep -l '[ ]$' | grep -v testsuite/`; do sed -i -e 's/[ ]*$//' $i; done
I've excluded files which I knew are obviously generated or go FE.
Is there anything else we'd want to avoid the changes?
Due to patch size, I've split it between gcc/ part (this patch)
and rest (include/, libiberty/, libgcc/, libcpp/, libstdc++-v3/).
2024-10-24 Jakub Jelinek <jakub@redhat.com>
gcc/
* lra-assigns.cc: Remove trailing whitespace.
* symtab.cc: Likewise.
* stmt.cc: Likewise.
* cgraphbuild.cc: Likewise.
* cfgcleanup.cc: Likewise.
* loop-init.cc: Likewise.
* df-problems.cc: Likewise.
* diagnostic-macro-unwinding.cc: Likewise.
* langhooks.h: Likewise.
* except.cc: Likewise.
* tree-vect-loop.cc: Likewise.
* coverage.cc: Likewise.
* hash-table.cc: Likewise.
* ggc-page.cc: Likewise.
* gimple-ssa-strength-reduction.cc: Likewise.
* tree-parloops.cc: Likewise.
* internal-fn.cc: Likewise.
* ipa-split.cc: Likewise.
* calls.cc: Likewise.
* reorg.cc: Likewise.
* sbitmap.h: Likewise.
* omp-offload.cc: Likewise.
* cfgrtl.cc: Likewise.
* reginfo.cc: Likewise.
* gengtype.h: Likewise.
* omp-general.h: Likewise.
* ipa-comdats.cc: Likewise.
* gimple-range-edge.h: Likewise.
* tree-ssa-structalias.cc: Likewise.
* target.def: Likewise.
* basic-block.h: Likewise.
* graphite-isl-ast-to-gimple.cc: Likewise.
* auto-profile.cc: Likewise.
* optabs.cc: Likewise.
* gengtype-lex.l: Likewise.
* optabs.def: Likewise.
* ira-build.cc: Likewise.
* ira.cc: Likewise.
* function.h: Likewise.
* tree-ssa-propagate.cc: Likewise.
* gcov-io.cc: Likewise.
* builtin-types.def: Likewise.
* ddg.cc: Likewise.
* lra-spills.cc: Likewise.
* cfg.cc: Likewise.
* bitmap.cc: Likewise.
* gimple-range-gori.h: Likewise.
* tree-ssa-loop-im.cc: Likewise.
* cfghooks.h: Likewise.
* genmatch.cc: Likewise.
* explow.cc: Likewise.
* lto-streamer-in.cc: Likewise.
* graphite-scop-detection.cc: Likewise.
* ipa-prop.cc: Likewise.
* gcc.cc: Likewise.
* vec.h: Likewise.
* cfgexpand.cc: Likewise.
* config/alpha/vms.h: Likewise.
* config/alpha/alpha.cc: Likewise.
* config/alpha/driver-alpha.cc: Likewise.
* config/alpha/elf.h: Likewise.
* config/iq2000/iq2000.h: Likewise.
* config/iq2000/iq2000.cc: Likewise.
* config/pa/pa-64.h: Likewise.
* config/pa/som.h: Likewise.
* config/pa/pa.cc: Likewise.
* config/pa/pa.h: Likewise.
* config/pa/pa32-regs.h: Likewise.
* config/c6x/c6x.cc: Likewise.
* config/openbsd-stdint.h: Likewise.
* config/elfos.h: Likewise.
* config/lm32/lm32.cc: Likewise.
* config/lm32/lm32.h: Likewise.
* config/lm32/lm32-protos.h: Likewise.
* config/darwin-c.cc: Likewise.
* config/rx/rx.cc: Likewise.
* config/host-darwin.h: Likewise.
* config/netbsd.h: Likewise.
* config/ia64/ia64.cc: Likewise.
* config/ia64/freebsd.h: Likewise.
* config/avr/avr-c.cc: Likewise.
* config/avr/avr.cc: Likewise.
* config/avr/avr-arch.h: Likewise.
* config/avr/avr.h: Likewise.
* config/avr/stdfix.h: Likewise.
* config/avr/gen-avr-mmcu-specs.cc: Likewise.
* config/avr/avr-log.cc: Likewise.
* config/avr/elf.h: Likewise.
* config/avr/gen-avr-mmcu-texi.cc: Likewise.
* config/avr/avr-devices.cc: Likewise.
* config/nvptx/nvptx.cc: Likewise.
* config/vx-common.h: Likewise.
* config/sol2.cc: Likewise.
* config/rl78/rl78.cc: Likewise.
* config/cris/cris.cc: Likewise.
* config/arm/symbian.h: Likewise.
* config/arm/unknown-elf.h: Likewise.
* config/arm/linux-eabi.h: Likewise.
* config/arm/arm.cc: Likewise.
* config/arm/arm-mve-builtins.h: Likewise.
* config/arm/bpabi.h: Likewise.
* config/arm/vxworks.h: Likewise.
* config/arm/arm.h: Likewise.
* config/arm/aout.h: Likewise.
* config/arm/elf.h: Likewise.
* config/host-linux.cc: Likewise.
* config/sh/sh_treg_combine.cc: Likewise.
* config/sh/vxworks.h: Likewise.
* config/sh/elf.h: Likewise.
* config/sh/netbsd-elf.h: Likewise.
* config/sh/sh.cc: Likewise.
* config/sh/embed-elf.h: Likewise.
* config/sh/sh.h: Likewise.
* config/darwin-driver.cc: Likewise.
* config/m32c/m32c.cc: Likewise.
* config/frv/frv.cc: Likewise.
* config/openbsd.h: Likewise.
* config/aarch64/aarch64-protos.h: Likewise.
* config/aarch64/aarch64-builtins.cc: Likewise.
* config/aarch64/aarch64-cost-tables.h: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/bfin/bfin.cc: Likewise.
* config/bfin/bfin.h: Likewise.
* config/bfin/bfin-protos.h: Likewise.
* config/i386/gmm_malloc.h: Likewise.
* config/i386/djgpp.h: Likewise.
* config/i386/sol2.h: Likewise.
* config/i386/stringop.def: Likewise.
* config/i386/i386-features.cc: Likewise.
* config/i386/openbsdelf.h: Likewise.
* config/i386/cpuid.h: Likewise.
* config/i386/i386.h: Likewise.
* config/i386/smmintrin.h: Likewise.
* config/i386/avx10_2-512convertintrin.h: Likewise.
* config/i386/i386-options.cc: Likewise.
* config/i386/i386-opts.h: Likewise.
* config/i386/i386-expand.cc: Likewise.
* config/i386/avx512dqintrin.h: Likewise.
* config/i386/wmmintrin.h: Likewise.
* config/i386/gnu-user.h: Likewise.
* config/i386/host-mingw32.cc: Likewise.
* config/i386/avx10_2bf16intrin.h: Likewise.
* config/i386/cygwin.h: Likewise.
* config/i386/driver-i386.cc: Likewise.
* config/i386/biarch64.h: Likewise.
* config/i386/host-cygwin.cc: Likewise.
* config/i386/cygming.h: Likewise.
* config/i386/i386-builtins.cc: Likewise.
* config/i386/avx10_2convertintrin.h: Likewise.
* config/i386/i386.cc: Likewise.
* config/i386/gas.h: Likewise.
* config/i386/freebsd.h: Likewise.
* config/mingw/winnt-cxx.cc: Likewise.
* config/mingw/winnt.cc: Likewise.
* config/h8300/h8300.cc: Likewise.
* config/host-solaris.cc: Likewise.
* config/m32r/m32r.h: Likewise.
* config/m32r/m32r.cc: Likewise.
* config/darwin.h: Likewise.
* config/sparc/linux64.h: Likewise.
* config/sparc/sparc-protos.h: Likewise.
* config/sparc/sysv4.h: Likewise.
* config/sparc/sparc.h: Likewise.
* config/sparc/linux.h: Likewise.
* config/sparc/freebsd.h: Likewise.
* config/sparc/sparc.cc: Likewise.
* config/gcn/gcn-run.cc: Likewise.
* config/gcn/gcn.cc: Likewise.
* config/gcn/gcn-tree.cc: Likewise.
* config/kopensolaris-gnu.h: Likewise.
* config/nios2/nios2.h: Likewise.
* config/nios2/elf.h: Likewise.
* config/nios2/nios2.cc: Likewise.
* config/host-netbsd.cc: Likewise.
* config/rtems.h: Likewise.
* config/pdp11/pdp11.cc: Likewise.
* config/pdp11/pdp11.h: Likewise.
* config/mn10300/mn10300.cc: Likewise.
* config/mn10300/linux.h: Likewise.
* config/moxie/moxie.h: Likewise.
* config/moxie/moxie.cc: Likewise.
* config/rs6000/aix71.h: Likewise.
* config/rs6000/vec_types.h: Likewise.
* config/rs6000/xcoff.h: Likewise.
* config/rs6000/rs6000.cc: Likewise.
* config/rs6000/rs6000-internal.h: Likewise.
* config/rs6000/rs6000-p8swap.cc: Likewise.
* config/rs6000/rs6000-c.cc: Likewise.
* config/rs6000/aix.h: Likewise.
* config/rs6000/rs6000-logue.cc: Likewise.
* config/rs6000/rs6000-string.cc: Likewise.
* config/rs6000/rs6000-call.cc: Likewise.
* config/rs6000/ppu_intrinsics.h: Likewise.
* config/rs6000/altivec.h: Likewise.
* config/rs6000/darwin.h: Likewise.
* config/rs6000/host-darwin.cc: Likewise.
* config/rs6000/freebsd64.h: Likewise.
* config/rs6000/spu2vmx.h: Likewise.
* config/rs6000/linux.h: Likewise.
* config/rs6000/si2vmx.h: Likewise.
* config/rs6000/driver-rs6000.cc: Likewise.
* config/rs6000/freebsd.h: Likewise.
* config/vxworksae.h: Likewise.
* config/mips/frame-header-opt.cc: Likewise.
* config/mips/mips.h: Likewise.
* config/mips/mips.cc: Likewise.
* config/mips/sde.h: Likewise.
* config/darwin-protos.h: Likewise.
* config/mcore/mcore-elf.h: Likewise.
* config/mcore/mcore.h: Likewise.
* config/mcore/mcore.cc: Likewise.
* config/epiphany/epiphany.cc: Likewise.
* config/fr30/fr30.h: Likewise.
* config/fr30/fr30.cc: Likewise.
* config/riscv/riscv-vector-builtins-shapes.cc: Likewise.
* config/riscv/riscv-vector-builtins-bases.cc: Likewise.
* config/visium/visium.h: Likewise.
* config/mmix/mmix.cc: Likewise.
* config/v850/v850.cc: Likewise.
* config/v850/v850-c.cc: Likewise.
* config/v850/v850.h: Likewise.
* config/stormy16/stormy16.cc: Likewise.
* config/stormy16/stormy16-protos.h: Likewise.
* config/stormy16/stormy16.h: Likewise.
* config/arc/arc.cc: Likewise.
* config/vxworks.cc: Likewise.
* config/microblaze/microblaze-c.cc: Likewise.
* config/microblaze/microblaze-protos.h: Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/microblaze/microblaze.cc: Likewise.
* config/freebsd-spec.h: Likewise.
* config/m68k/m68kelf.h: Likewise.
* config/m68k/m68k.cc: Likewise.
* config/m68k/netbsd-elf.h: Likewise.
* config/m68k/linux.h: Likewise.
* config/freebsd.h: Likewise.
* config/host-openbsd.cc: Likewise.
* regcprop.cc: Likewise.
* dumpfile.cc: Likewise.
* combine.cc: Likewise.
* tree-ssa-forwprop.cc: Likewise.
* ipa-profile.cc: Likewise.
* hw-doloop.cc: Likewise.
* opts.cc: Likewise.
* gcc-ar.cc: Likewise.
* tree-cfg.cc: Likewise.
* incpath.cc: Likewise.
* tree-ssa-sccvn.cc: Likewise.
* function.cc: Likewise.
* genattrtab.cc: Likewise.
* rtl.def: Likewise.
* genchecksum.cc: Likewise.
* profile.cc: Likewise.
* df-core.cc: Likewise.
* tree-pretty-print.cc: Likewise.
* tree.h: Likewise.
* plugin.cc: Likewise.
* tree-ssa-loop-ch.cc: Likewise.
* emit-rtl.cc: Likewise.
* haifa-sched.cc: Likewise.
* gimple-range-edge.cc: Likewise.
* range-op.cc: Likewise.
* tree-ssa-ccp.cc: Likewise.
* dwarf2cfi.cc: Likewise.
* recog.cc: Likewise.
* vtable-verify.cc: Likewise.
* system.h: Likewise.
* regrename.cc: Likewise.
* tree-ssa-dom.cc: Likewise.
* loop-unroll.cc: Likewise.
* lra-constraints.cc: Likewise.
* pretty-print.cc: Likewise.
* ifcvt.cc: Likewise.
* ipa.cc: Likewise.
* alloc-pool.h: Likewise.
* collect2.cc: Likewise.
* pointer-query.cc: Likewise.
* cfgloop.cc: Likewise.
* toplev.cc: Likewise.
* sese.cc: Likewise.
* gengtype.cc: Likewise.
* gimplify-me.cc: Likewise.
* double-int.cc: Likewise.
* bb-reorder.cc: Likewise.
* dwarf2out.cc: Likewise.
* tree-ssa-loop-ivcanon.cc: Likewise.
* tree-ssa-reassoc.cc: Likewise.
* cgraph.cc: Likewise.
* sel-sched.cc: Likewise.
* attribs.cc: Likewise.
* expr.cc: Likewise.
* tree-ssa-scopedtables.h: Likewise.
* gimple-range-cache.cc: Likewise.
* ipa-pure-const.cc: Likewise.
* tree-inline.cc: Likewise.
* genhooks.cc: Likewise.
* gimple-range-phi.h: Likewise.
* shrink-wrap.cc: Likewise.
* tree.cc: Likewise.
* gimple.cc: Likewise.
* backend.h: Likewise.
* opts-common.cc: Likewise.
* cfg-flags.def: Likewise.
* gcse-common.cc: Likewise.
* tree-ssa-scopedtables.cc: Likewise.
* ccmp.cc: Likewise.
* builtins.def: Likewise.
* builtin-attrs.def: Likewise.
* postreload.cc: Likewise.
* sched-deps.cc: Likewise.
* ipa-inline-transform.cc: Likewise.
* tree-vect-generic.cc: Likewise.
* ipa-polymorphic-call.cc: Likewise.
* builtins.cc: Likewise.
* sel-sched-ir.cc: Likewise.
* trans-mem.cc: Likewise.
* ipa-visibility.cc: Likewise.
* cgraph.h: Likewise.
* tree-ssa-phiopt.cc: Likewise.
* genopinit.cc: Likewise.
* ipa-inline.cc: Likewise.
* omp-low.cc: Likewise.
* ipa-utils.cc: Likewise.
* tree-ssa-math-opts.cc: Likewise.
* tree-ssa-ifcombine.cc: Likewise.
* gimple-range.cc: Likewise.
* ipa-fnsummary.cc: Likewise.
* ira-color.cc: Likewise.
* value-prof.cc: Likewise.
* varasm.cc: Likewise.
* ipa-icf.cc: Likewise.
* ira-emit.cc: Likewise.
* lto-streamer.h: Likewise.
* lto-wrapper.cc: Likewise.
* regs.h: Likewise.
* gengtype-parse.cc: Likewise.
* alias.cc: Likewise.
* lto-streamer.cc: Likewise.
* real.h: Likewise.
* wide-int.h: Likewise.
* targhooks.cc: Likewise.
* gimple-ssa-warn-access.cc: Likewise.
* real.cc: Likewise.
* ipa-reference.cc: Likewise.
* bitmap.h: Likewise.
* ginclude/float.h: Likewise.
* ginclude/stddef.h: Likewise.
* ginclude/stdarg.h: Likewise.
* ginclude/stdatomic.h: Likewise.
* optabs.h: Likewise.
* sel-sched-ir.h: Likewise.
* convert.cc: Likewise.
* cgraphunit.cc: Likewise.
* lra-remat.cc: Likewise.
* tree-if-conv.cc: Likewise.
* gcov-dump.cc: Likewise.
* tree-predcom.cc: Likewise.
* dominance.cc: Likewise.
* gimple-range-cache.h: Likewise.
* ipa-devirt.cc: Likewise.
* rtl.h: Likewise.
* ubsan.cc: Likewise.
* tree-ssa.cc: Likewise.
* ssa.h: Likewise.
* cse.cc: Likewise.
* jump.cc: Likewise.
* hwint.h: Likewise.
* caller-save.cc: Likewise.
* coretypes.h: Likewise.
* ipa-fnsummary.h: Likewise.
* tree-ssa-strlen.cc: Likewise.
* modulo-sched.cc: Likewise.
* cgraphclones.cc: Likewise.
* lto-cgraph.cc: Likewise.
* hw-doloop.h: Likewise.
* data-streamer.h: Likewise.
* compare-elim.cc: Likewise.
* profile-count.h: Likewise.
* tree-vect-loop-manip.cc: Likewise.
* ree.cc: Likewise.
* reload.cc: Likewise.
* tree-ssa-loop-split.cc: Likewise.
* tree-into-ssa.cc: Likewise.
* gcse.cc: Likewise.
* cfgloopmanip.cc: Likewise.
* df.h: Likewise.
* fold-const.cc: Likewise.
* wide-int.cc: Likewise.
* gengtype-state.cc: Likewise.
* sanitizer.def: Likewise.
* tree-ssa-sink.cc: Likewise.
* target-hooks-macros.h: Likewise.
* tree-ssa-pre.cc: Likewise.
* gimple-pretty-print.cc: Likewise.
* ipa-utils.h: Likewise.
* tree-outof-ssa.cc: Likewise.
* tree-ssa-coalesce.cc: Likewise.
* gimple-match.h: Likewise.
* tree-ssa-loop-niter.cc: Likewise.
* tree-loop-distribution.cc: Likewise.
* tree-emutls.cc: Likewise.
* tree-eh.cc: Likewise.
* varpool.cc: Likewise.
* ssa-iterators.h: Likewise.
* asan.cc: Likewise.
* reload1.cc: Likewise.
* cfgloopanal.cc: Likewise.
* tree-vectorizer.cc: Likewise.
* simplify-rtx.cc: Likewise.
* opts-global.cc: Likewise.
* gimple-ssa-store-merging.cc: Likewise.
* expmed.cc: Likewise.
* tree-ssa-loop-prefetch.cc: Likewise.
* tree-ssa-dse.h: Likewise.
* tree-vect-stmts.cc: Likewise.
* gimple-fold.cc: Likewise.
* lra-coalesce.cc: Likewise.
* data-streamer-out.cc: Likewise.
* diagnostic.cc: Likewise.
* tree-ssa-alias.cc: Likewise.
* tree-vect-patterns.cc: Likewise.
* common/common-target.def: Likewise.
* common/config/rx/rx-common.cc: Likewise.
* common/config/msp430/msp430-common.cc: Likewise.
* common/config/avr/avr-common.cc: Likewise.
* common/config/i386/i386-common.cc: Likewise.
* common/config/pdp11/pdp11-common.cc: Likewise.
* common/config/rs6000/rs6000-common.cc: Likewise.
* common/config/mcore/mcore-common.cc: Likewise.
* graphite.cc: Likewise.
* gimple-low.cc: Likewise.
* genmodes.cc: Likewise.
* gimple-loop-jam.cc: Likewise.
* lto-streamer-out.cc: Likewise.
* predict.cc: Likewise.
* omp-expand.cc: Likewise.
* gimple-array-bounds.cc: Likewise.
* predict.def: Likewise.
* opts.h: Likewise.
* tree-stdarg.cc: Likewise.
* gimplify.cc: Likewise.
* ira-lives.cc: Likewise.
* loop-doloop.cc: Likewise.
* lra.cc: Likewise.
* gimple-iterator.h: Likewise.
* tree-sra.cc: Likewise.
gcc/fortran/
* trans-openmp.cc: Remove trailing whitespace.
* trans-common.cc: Likewise.
* match.h: Likewise.
* scanner.cc: Likewise.
* gfortranspec.cc: Likewise.
* io.cc: Likewise.
* iso-c-binding.def: Likewise.
* iso-fortran-env.def: Likewise.
* types.def: Likewise.
* openmp.cc: Likewise.
* f95-lang.cc: Likewise.
gcc/analyzer/
* state-purge.cc: Remove trailing whitespace.
* region-model.h: Likewise.
* region-model.cc: Likewise.
* program-point.cc: Likewise.
* exploded-graph.h: Likewise.
* program-state.cc: Likewise.
* supergraph.cc: Likewise.
gcc/c-family/
* c-ubsan.cc: Remove trailing whitespace.
* stub-objc.cc: Likewise.
* c-pragma.cc: Likewise.
* c-ppoutput.cc: Likewise.
* c-indentation.cc: Likewise.
* c-ada-spec.cc: Likewise.
* c-opts.cc: Likewise.
* c-common.cc: Likewise.
* c-format.cc: Likewise.
* c-omp.cc: Likewise.
* c-objc.h: Likewise.
* c-cppbuiltin.cc: Likewise.
* c-attribs.cc: Likewise.
* c-target.def: Likewise.
* c-common.h: Likewise.
gcc/c/
* c-typeck.cc: Remove trailing whitespace.
* gimple-parser.cc: Likewise.
* c-parser.cc: Likewise.
* c-decl.cc: Likewise.
gcc/cp/
* vtable-class-hierarchy.cc: Remove trailing whitespace.
* typeck2.cc: Likewise.
* decl.cc: Likewise.
* init.cc: Likewise.
* semantics.cc: Likewise.
* module.cc: Likewise.
* rtti.cc: Likewise.
* cxx-pretty-print.cc: Likewise.
* cvt.cc: Likewise.
* mangle.cc: Likewise.
* name-lookup.h: Likewise.
* coroutines.cc: Likewise.
* error.cc: Likewise.
* lambda.cc: Likewise.
* tree.cc: Likewise.
* g++spec.cc: Likewise.
* decl2.cc: Likewise.
* cp-tree.h: Likewise.
* parser.cc: Likewise.
* pt.cc: Likewise.
* call.cc: Likewise.
* lex.cc: Likewise.
* cp-lang.cc: Likewise.
* cp-tree.def: Likewise.
* constexpr.cc: Likewise.
* typeck.cc: Likewise.
* name-lookup.cc: Likewise.
* optimize.cc: Likewise.
* search.cc: Likewise.
* mapper-client.cc: Likewise.
* ptree.cc: Likewise.
* class.cc: Likewise.
gcc/jit/
* docs/examples/tut04-toyvm/toyvm.cc: Remove trailing whitespace.
gcc/lto/
* lto-object.cc: Remove trailing whitespace.
* lto-symtab.cc: Likewise.
* lto-partition.cc: Likewise.
* lang-specs.h: Likewise.
* lto-lang.cc: Likewise.
gcc/objc/
* objc-encoding.cc: Remove trailing whitespace.
* objc-map.h: Likewise.
* objc-next-runtime-abi-01.cc: Likewise.
* objc-act.cc: Likewise.
* objc-map.cc: Likewise.
gcc/objcp/
* objcp-decl.cc: Remove trailing whitespace.
* objcp-lang.cc: Likewise.
* objcp-decl.h: Likewise.
gcc/rust/
* util/optional.h: Remove trailing whitespace.
* util/expected.h: Likewise.
* util/rust-unicode-data.h: Likewise.
gcc/m2/
* mc-boot/GFpuIO.cc: Remove trailing whitespace.
* mc-boot/GFIO.cc: Likewise.
* mc-boot/GFormatStrings.cc: Likewise.
* mc-boot/GCmdArgs.cc: Likewise.
* mc-boot/GDebug.h: Likewise.
* mc-boot/GM2Dependent.cc: Likewise.
* mc-boot/GRTint.cc: Likewise.
* mc-boot/GDebug.cc: Likewise.
* mc-boot/GmcError.cc: Likewise.
* mc-boot/Gmcp4.cc: Likewise.
* mc-boot/GM2RTS.cc: Likewise.
* mc-boot/GIO.cc: Likewise.
* mc-boot/Gmcp5.cc: Likewise.
* mc-boot/GDynamicStrings.cc: Likewise.
* mc-boot/Gmcp1.cc: Likewise.
* mc-boot/GFormatStrings.h: Likewise.
* mc-boot/Gmcp2.cc: Likewise.
* mc-boot/Gmcp3.cc: Likewise.
* pge-boot/GFIO.cc: Likewise.
* pge-boot/GDebug.h: Likewise.
* pge-boot/GM2Dependent.cc: Likewise.
* pge-boot/GDebug.cc: Likewise.
* pge-boot/GM2RTS.cc: Likewise.
* pge-boot/GSymbolKey.cc: Likewise.
* pge-boot/GIO.cc: Likewise.
* pge-boot/GIndexing.cc: Likewise.
* pge-boot/GDynamicStrings.cc: Likewise.
* pge-boot/GFormatStrings.h: Likewise.
gcc/go/
* go-gcc.cc: Remove trailing whitespace.
* gospec.cc: Likewise.
|
|
Architecture-specific CFI directives are currently declared an processed
among others architecture-independent CFI directives in gcc/dwarf2* files.
This approach creates confusion, specifically in the case of DWARF
instructions in the vendor space and using the same instruction code.
Such a clash currently happen between DW_CFA_GNU_window_save (used on
SPARC) and DW_CFA_AARCH64_negate_ra_state (used on AArch64), and both
having the same instruction code 0x2d.
Then AArch64 compilers generates a SPARC CFI directive (.cfi_window_save)
instead of .cfi_negate_ra_state, contrarilly to what is expected in
[DWARF for the Arm 64-bit Architecture (AArch64)](https://github.com/
ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst).
This refactoring does not solve completely the problem, but improve the
situation by moving some of the processing of those directives (more
specifically their output in the assembly) to the backend via 2 target
hooks:
- DW_CFI_OPRND1_DESC: parse the first operand of the directive (if any).
- OUTPUT_CFI_DIRECTIVE: output the CFI directive as a string.
Additionally, this patch also contains a renaming of an enum used for
return address mangling on AArch64.
gcc/ChangeLog:
* config/aarch64/aarch64.cc
(aarch64_output_cfi_directive): New hook for CFI directives.
(aarch64_dw_cfi_oprnd1_desc): Same.
(TARGET_OUTPUT_CFI_DIRECTIVE): Hook for output_cfi_directive.
(TARGET_DW_CFI_OPRND1_DESC): Hook for dw_cfi_oprnd1_desc.
* config/sparc/sparc.cc
(sparc_output_cfi_directive): New hook for CFI directives.
(sparc_dw_cfi_oprnd1_desc): Same.
(TARGET_OUTPUT_CFI_DIRECTIVE): Hook for output_cfi_directive.
(TARGET_DW_CFI_OPRND1_DESC): Hook for dw_cfi_oprnd1_desc.
* coretypes.h
(struct dw_cfi_node): Forward declaration of CFI type from
gcc/dwarf2out.h.
(enum dw_cfi_oprnd_type): Same.
(enum dwarf_call_frame_info): Same.
* doc/tm.texi: Regenerated from doc/tm.texi.in.
* doc/tm.texi.in: Add doc for new target hooks.
type of enum to allow forward declaration.
* dwarf2cfi.cc
(struct dw_cfi_row): Update the description for window_save
and ra_mangled.
(dwarf2out_frame_debug_cfa_negate_ra_state): Use AArch64 CFI
directive instead of the SPARC one.
(change_cfi_row): Use the right CFI directive's name for RA
mangling.
(output_cfi): Remove explicit architecture-specific CFI
directive DW_CFA_GNU_window_save that falls into default case.
(output_cfi_directive): Use target hook as default.
* dwarf2out.cc (dw_cfi_oprnd1_desc): Use target hook as default.
* dwarf2out.h (enum dw_cfi_oprnd_type): specify underlying type
of enum to allow forward declaration.
(dw_cfi_oprnd1_desc): Call target hook.
(output_cfi_directive): Use dw_cfi_ref instead of struct
dw_cfi_node *.
* hooks.cc
(hook_bool_dwcfi_dwcfioprndtyperef_false): New.
(hook_bool_FILEptr_dwcfiptr_false): New.
* hooks.h
(hook_bool_dwcfi_dwcfioprndtyperef_false): New.
(hook_bool_FILEptr_dwcfiptr_false): New.
* target.def: Documentation for new hooks.
include/ChangeLog:
* dwarf2.h (enum dwarf_call_frame_info): specify underlying
libffi/ChangeLog:
* include/ffi_cfi.h (cfi_negate_ra_state): Declare AArch64 cfi
directive.
libgcc/ChangeLog:
* config/aarch64/aarch64-asm.h (PACIASP): Replace SPARC CFI
directive by AArch64 one.
(AUTIASP): Same.
libitm/ChangeLog:
* config/aarch64/sjlj.S: Replace SPARC CFI directive by
AArch64 one.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/pr94515-1.C: Replace SPARC CFI directive by
AArch64 one.
* g++.target/aarch64/pr94515-2.C: Same.
|
|
The following adds a target hook to specify whether regs of MODE can be
used to transfer bits. The hook is supposed to be used for value-numbering
to decide whether a value loaded in such mode can be punned to another
mode instead of re-loading the value in the other mode and for SRA to
decide whether MODE is suitable as container holding a value to be
used in different modes.
* target.def (mode_can_transfer_bits): New target hook.
* target.h (mode_can_transfer_bits): New function wrapping the
hook and providing default behavior.
* doc/tm.texi.in: Update.
* doc/tm.texi: Re-generate.
|
|
This adds a conditional store optimization for the vectorizer as a pattern.
The vectorizer already supports modifying memory accesses because of the pattern
based gather/scatter recognition.
Doing it in the vectorizer allows us to still keep the ability to vectorize such
loops for architectures that don't have MASK_STORE support, whereas doing this
in ifcvt makes us commit to MASK_STORE.
Concretely for this loop:
void foo1 (char *restrict a, int *restrict b, int *restrict c, int n, int stride)
{
if (stride <= 1)
return;
for (int i = 0; i < n; i++)
{
int res = c[i];
int t = b[i+stride];
if (a[i] != 0)
res = t;
c[i] = res;
}
}
today we generate:
.L3:
ld1b z29.s, p7/z, [x0, x5]
ld1w z31.s, p7/z, [x2, x5, lsl 2]
ld1w z30.s, p7/z, [x1, x5, lsl 2]
cmpne p15.b, p6/z, z29.b, #0
sel z30.s, p15, z30.s, z31.s
st1w z30.s, p7, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any .L3
which in gimple is:
vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
mask__34.16_79 = vect__9.15_77 != { 0, ... };
vect_res_11.17_80 = VEC_COND_EXPR <mask__34.16_79, vect_t_20.12_74, vect_res_18.9_68>;
.MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);
A MASK_STORE is already conditional, so there's no need to perform the load of
the old values and the VEC_COND_EXPR. This patch makes it so we generate:
vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
mask__34.16_79 = vect__9.15_77 != { 0, ... };
.MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);
which generates:
.L3:
ld1b z30.s, p7/z, [x0, x5]
ld1w z31.s, p7/z, [x1, x5, lsl 2]
cmpne p7.b, p7/z, z30.b, #0
st1w z31.s, p7, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any .L3
gcc/ChangeLog:
PR tree-optimization/115531
* tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
(vect_recog_cond_store_pattern): New.
(vect_vect_recog_func_ptrs): Use it.
* target.def (conditional_operation_is_expensive): New.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document it.
* targhooks.cc (default_conditional_operation_is_expensive): New.
* targhooks.h (default_conditional_operation_is_expensive): New.
|
|
The `function_attribute_inlinable_p` hook documentation described it
returning the value if it is OK to inline the provided fndecl into "the
current function". AFAICS This hook is only called when
`current_function_decl` is the same as the `fndecl` argument that the
hook is given, hence asking whether `fndecl` can be inlined into "the
current function" doesn't seem relevant. Moreover from what I see no
existing implementation of `function_attribute_inlinable_p` uses "the
current function" in any way.
Update the documentation to match this understanding.
The `unspec_may_trap_p` documentation mentioned applying to either
`unspec` or `unspec_volatile`. AFAICS this hook is only used for
`unspec` codes since c84a808e493a, so I removed the mention of
`unspec_volatile`.
gcc/ChangeLog:
* doc/tm.texi: Regenerated.
* target.def (function_attribute_inlinable_p,
unspec_may_trap_p): Update documentation.
|
|
Currently how we determine which mode will be used for a
floating point type is that for a given type precision
(size) call mode_for_size to get the first mode which has
this size in the specified class. On Powerpc, we have
three modes (TF/KF/IF) having the same mode precision 128
(see[1]), so the processing forces us to have to place TF
at the first place, it would require us to make more
adjustment in some generic code to avoid some unexpected
mode conversions and it would be even worse if we get rid
of TF eventually one day. And as Joseph pointed out in [2],
"floating types should have their mode, not a poorly
defined precision value", as Joseph and Richi suggested,
this patch is to introduce one hook mode_for_floating_type
which returns the corresponding mode for type float, double
or long double. The default implementation returns SFmode
for float and DFmode for double or long double. For ports
which need special treatment, there are some other patches
for their own port specific implementation (referring to
how {,LONG_}DOUBLE_TYPE_SIZE get used there). For all
generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending
on the context, some of them are replaced with TYPE_PRECISION
of the according type node, some other are replaced with
GET_MODE_PRECISION on the mode from mode_for_floating_type.
This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE,
so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port
specific are removed, but there are still some which are
good to be kept for readability then they get renamed with
port specific prefix.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651017.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html
gcc/jit/ChangeLog:
* jit-recording.cc (recording::memento_of_get_type::get_size): Update
macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.
gcc/ChangeLog:
* coretypes.h (enum tree_index): Forward declaration.
* defaults.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* doc/rtl.texi: Update document by replacing {FLOAT,DOUBLE}_TYPE_SIZE
with C type {float,double}.
* doc/tm.texi.in: Document new hook mode_for_floating_type, remove
document entries for {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE and
update document for WIDEST_HARDWARE_FP_SIZE.
* doc/tm.texi: Regenerate.
* emit-rtl.cc (init_emit_once): Replace DOUBLE_TYPE_SIZE by
calling targetm.c.mode_for_floating_type with TI_DOUBLE_TYPE.
* real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Use TYPE_PRECISION of
long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE.
* system.h (FLOAT_TYPE_SIZE): Poison.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* target.def (mode_for_floating_type): New hook.
* targhooks.cc (default_mode_for_floating_type): New function.
(default_scalar_mode_supported_p): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE.
* targhooks.h (default_mode_for_floating_type): New declaration.
* tree-core.h (enum tree_index): Specify underlying type unsigned
to sync with forward declaration in coretypes.h.
(NUM_FLOATN_TYPES): Explicitly convert to int.
(NUM_FLOATNX_TYPES): Likewise.
(NUM_FLOATN_NX_TYPES): Likewise.
* tree.cc (build_common_tree_nodes): Update macros
{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling
targetm.c.mode_for_floating_type with
TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE and set type mode accordingly.
* config/arc/arc.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/bpf/bpf.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/epiphany/epiphany.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/fr30/fr30.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/frv/frv.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/ft32/ft32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/gcn/gcn.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/iq2000/iq2000.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/lm32/lm32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/m32c/m32c.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/m32r/m32r.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/microblaze/microblaze.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/mmix/mmix.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/moxie/moxie.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/msp430/msp430.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nds32/nds32.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nios2/nios2.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/nvptx/nvptx.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/or1k/or1k.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pdp11/pdp11.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pru/pru.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/stormy16/stormy16.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/visium/visium.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/xtensa/xtensa.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rs6000/rs6000.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(rs6000_c_mode_for_floating_type): New function.
* config/rs6000/rs6000.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/aarch64/aarch64.cc (aarch64_c_mode_for_floating_type):
New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/aarch64/aarch64.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/alpha/alpha.cc (alpha_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/alpha/alpha.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/avr/avr.cc (avr_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/avr/avr.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/i386/i386.cc (ix86_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/i386/i386.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/ia64/ia64.cc (ia64_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/ia64/ia64.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/riscv/riscv.cc (riscv_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/riscv/riscv.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rl78/rl78.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(rl78_c_mode_for_floating_type): New function.
* config/rl78/rl78.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/rx/rx.cc (rx_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/rx/rx.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/s390/s390.cc (s390_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/s390/s390.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/sh/sh.cc (sh_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/sh/sh.h (LONG_DOUBLE_TYPE_SIZE): Remove.
* config/h8300/h8300.cc (h8300_c_mode_for_floating_type): New
function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/h8300/h8300.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Remove.
(DOUBLE_TYPE_MODE): New macro.
* config/h8300/linux.h (DOUBLE_TYPE_SIZE): Remove.
(DOUBLE_TYPE_MODE): New macro.
* config/loongarch/loongarch.cc (loongarch_c_mode_for_floating_type):
New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/loongarch/loongarch.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(LA_LONG_DOUBLE_TYPE_SIZE): ... this.
(UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE with
LA_LONG_DOUBLE_TYPE_SIZE.
(MAX_FIXED_MODE_SIZE): Likewise.
(STRUCTURE_SIZE_BOUNDARY): Likewise.
(BIGGEST_ALIGNMENT): Likewise.
* config/m68k/m68k.cc (m68k_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/m68k/m68k.h (LONG_DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_MODE): New macro.
* config/m68k/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_MODE): New macro.
* config/mips/mips.cc (mips_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
* config/mips/mips.h (UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE
with MIPS_LONG_DOUBLE_TYPE_SIZE.
(MAX_FIXED_MODE_SIZE): Likewise.
(STRUCTURE_SIZE_BOUNDARY): Likewise.
(BIGGEST_ALIGNMENT): Likewise.
(FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/mips/n32-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/pa/pa.cc (pa_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(pa_scalar_mode_supported_p): Rename FLOAT_TYPE_SIZE to
PA_FLOAT_TYPE_SIZE, rename DOUBLE_TYPE_SIZE to PA_DOUBLE_TYPE_SIZE
and rename LONG_DOUBLE_TYPE_SIZE to PA_LONG_DOUBLE_TYPE_SIZE.
* config/pa/pa.h (PA_FLOAT_TYPE_SIZE): New macro.
(PA_DOUBLE_TYPE_SIZE): Likewise.
(PA_LONG_DOUBLE_TYPE_SIZE): Likewise.
* config/pa/pa-64.h (FLOAT_TYPE_SIZE): Rename to ...
(PA_FLOAT_TYPE_SIZE): ... this.
(DOUBLE_TYPE_SIZE): Rename to ...
(PA_DOUBLE_TYPE_SIZE): ... this.
(LONG_DOUBLE_TYPE_SIZE): Rename to ...
(PA_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/pa/pa-hpux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(PA_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sparc.cc (sparc_c_mode_for_floating_type): New function.
(TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
(FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Likewise.
(LONG_DOUBLE_TYPE_SIZE): Likewise.
(sparc_type_code): Replace FLOAT_TYPE_SIZE with TYPE_PRECISION of
float_type_node.
* config/sparc/sparc.h (FLOAT_TYPE_SIZE): Remove.
(DOUBLE_TYPE_SIZE): Remove.
* config/sparc/freebsd.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/linux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/linux64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/openbsd64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sol2.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sp-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/sparc/sp64-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
(SPARC_LONG_DOUBLE_TYPE_SIZE): ... this.
* config/bfin/bfin.h (FLOAT_TYPE_SIZE): Rename to ...
(BFIN_FLOAT_TYPE_SIZE): ... this.
(DOUBLE_TYPE_SIZE): Rename to ...
(BFIN_DOUBLE_TYPE_SIZE): ... this.
(LONG_DOUBLE_TYPE_SIZE): Remove.
(UNITS_PER_FLOAT): Replace FLOAT_TYPE_SIZE with BFIN_FLOAT_TYPE_SIZE.
(UNITS_PER_DOUBLE): Replace DOUBLE_TYPE_SIZE with
BFIN_DOUBLE_TYPE_SIZE.
|
|
In cfgexpand, there is an optimization for branch which tests
targetm.gen_ccmp_first == NULL. However for target like x86-64, the
hook was implemented but it does not indicate that ccmp was enabled.
Add a new target hook TARGET_HAVE_CCMP and replace the middle-end
check for the existance of gen_ccmp_first to avoid misoptimization.
gcc/ChangeLog:
PR target/115370
PR target/115463
* target.def (have_ccmp): New target hook.
* targhooks.cc (default_have_ccmp): New function.
* targhooks.h (default_have_ccmp): New prototype.
* doc/tm.texi.in: Add TARGET_HAVE_CCMP.
* doc/tm.texi: Regenerate.
* cfgexpand.cc (expand_gimple_cond): Call targetm.have_ccmp
instead of checking if targetm.gen_ccmp_first exists.
* expr.cc (expand_expr_real_gassign): Likewise.
* config/i386/i386.cc (ix86_have_ccmp): New target hook to
check if APX_CCMP enabled.
(TARGET_HAVE_CCMP): Define.
|
|
This patch introduces infrastructure for targets to add an offset to
the label issued after the call_insn to set the call_return_pc
attribute. This will be used on rs6000, that sometimes issues another
instruction after the call proper as part of a call insn.
for gcc/ChangeLog
* target.def (call_offset_return_label): New hook.
* doc/tm.texi.in (TARGET_CALL_OFFSET_RETURN_LABEL): Add
placeholder.
* doc/tm.texi: Rebuild.
* dwarf2out.cc (struct call_arg_loc_node): Record call_insn
instead of call_arg_loc_note.
(add_AT_lbl_id): Add optional offset argument.
(gen_call_site_die): Compute and pass on a return pc offset.
(gen_subprogram_die): Move call_arg_loc_note computation...
(dwarf2out_var_location): ... from here. Set call_insn.
|
|
on mingw ia32 [PR114968]
__cxa_atexit/__cxa_thread_atexit/__cxa_throw functions accept function
pointers to usually directly destructors rather than wrappers around
them.
Now, mingw ia32 uses implicitly __attribute__((thiscall)) calling
conventions for METHOD_TYPE (where the this pointer is passed in %ecx
register, the rest on the stack), so these functions use:
in config/os/mingw32/os_defines.h:
#if defined (__i386__)
#define _GLIBCXX_CDTOR_CALLABI __thiscall
#endif
in libsupc++/cxxabi.h
__cxa_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void*) _GLIBCXX_NOTHROW;
__cxa_thread_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void *) _GLIBCXX_NOTHROW;
__cxa_throw(void*, std::type_info*, void (_GLIBCXX_CDTOR_CALLABI *) (void *))
__attribute__((__noreturn__));
Now, mingw for some weird reason uses
#define TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT hook_bool_void_true
so it never actually uses __cxa_atexit, but does use __cxa_thread_atexit
and __cxa_throw. Recent changes for modules result in more detailed
__cxa_*atexit/__cxa_throw prototypes precreated by the compiler, and if
that happens and one also includes <cxxabi.h>, the compiler complains about
mismatches in the prototypes.
One thing is the missing thiscall attribute on the FUNCTION_TYPE, the
other problem is that all of atexit/__cxa_atexit/__cxa_thread_atexit
get function pointer types created by a single function,
get_atexit_fn_ptr_type (), which creates it depending on if atexit
or __cxa_atexit will be used as either void(*)(void) or void(*)(void *),
but when using atexit and __cxa_thread_atexit it uses the wrong function
type for __cxa_thread_atexit.
The following patch adds a target hook to add the thiscall attribute to the
function pointers, and splits the get_atexit_fn_ptr_type () function into
get_atexit_fn_ptr_type () and get_cxa_atexit_fn_ptr_type (), the former always
creates shared void(*)(void) type, the latter creates either
void(*)(void*) (on most targets) or void(__attribute__((thiscall))*)(void*)
(on mingw ia32). So that we don't waiste another GTY global tree for it,
because cleanup_type used for the same purpose for __cxa_throw should be
the same, the code changes it to use that type too.
In register_dtor_fn then based on the decision whether to use atexit,
__cxa_atexit or __cxa_thread_atexit it picks the right function pointer
type, and also if it decides to emit a __tcf_* wrapper for the cleanup,
uses that type for that wrapper so that it agrees on calling convention.
2024-05-10 Jakub Jelinek <jakub@redhat.com>
PR target/114968
gcc/
* target.def (use_atexit_for_cxa_atexit): Remove spurious space
from comment.
(adjust_cdtor_callabi_fntype): New cxx target hook.
* targhooks.h (default_cxx_adjust_cdtor_callabi_fntype): Declare.
* targhooks.cc (default_cxx_adjust_cdtor_callabi_fntype): New
function.
* doc/tm.texi.in (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Add.
* doc/tm.texi: Regenerate.
* config/i386/i386.cc (ix86_cxx_adjust_cdtor_callabi_fntype): New
function.
(TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Redefine.
gcc/cp/
* cp-tree.h (atexit_fn_ptr_type_node, cleanup_type): Adjust macro
comments.
(get_cxa_atexit_fn_ptr_type): Declare.
* decl.cc (get_atexit_fn_ptr_type): Adjust function comment, only
build type for atexit argument.
(get_cxa_atexit_fn_ptr_type): New function.
(get_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than
get_atexit_fn_ptr_type when using __cxa_atexit.
(get_thread_atexit_node): Call get_cxa_atexit_fn_ptr_type
rather than get_atexit_fn_ptr_type.
(start_cleanup_fn): Add ob_parm argument, call
get_cxa_atexit_fn_ptr_type or get_atexit_fn_ptr_type depending
on it and create PARM_DECL also based on that argument.
(register_dtor_fn): Adjust start_cleanup_fn caller, use
get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type
for use_dtor casts.
* except.cc (build_throw): Use get_cxa_atexit_fn_ptr_type ().
|
|
|
|
This patch adds support for the "target_version" attribute to the middle
end and the C++ frontend, which will be used to implement function
multiversioning in the aarch64 backend.
On targets that don't use the "target" attribute for multiversioning,
there is no conflict between the "target" and "target_clones"
attributes. This patch therefore makes the mutual exclusion in
C-family, D and Ada conditonal upon the value of the
expanded_clones_attribute target hook.
The "target_version" attribute is only added to C++ in this patch,
because this is currently the only frontend which supports
multiversioning using the "target" attribute. Support for the
"target_version" attribute will be extended to C at a later date.
Targets that currently use the "target" attribute for function
multiversioning (i.e. i386 and rs6000) are not affected by this patch.
gcc/ChangeLog:
* attribs.cc (decl_attributes): Pass attribute name to target.
(is_function_default_version): Update comment to specify
incompatibility with target_version attributes.
* cgraphclones.cc (cgraph_node::create_version_clone_with_body):
Call valid_version_attribute_p for target_version attributes.
* defaults.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE): New macro.
* target.def (valid_version_attribute_p): New hook.
* doc/tm.texi.in: Add new hook.
* doc/tm.texi: Regenerate.
* multiple_target.cc (create_dispatcher_calls): Remove redundant
is_function_default_version check.
(expand_target_clones): Use target macro to pick attribute name.
* targhooks.cc (default_target_option_valid_version_attribute_p):
New.
* targhooks.h (default_target_option_valid_version_attribute_p):
New.
* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include
target_version attributes.
gcc/c-family/ChangeLog:
* c-attribs.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto, and add target_version.
(attr_target_version_exclusions): New.
(c_common_attribute_table): Add target_version.
(handle_target_version_attribute): New.
(handle_target_attribute): Amend comment.
(handle_target_clones_attribute): Ditto.
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.
gcc/d/ChangeLog:
* d-attribs.cc (attr_target_exclusions): Make
target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.
gcc/cp/ChangeLog:
* decl2.cc (check_classfn): Update comment to include
target_version attributes.
|
|
Given what I saw in the aarch64/arm psABIs for BITINT_TYPE, as I said
earlier I'm afraid we need to differentiate between the limb mode/precision
specified in the psABIs (what is used to decide how it is actually passed,
aligned or what size it has) vs. what limb mode/precision should be used
during bitint lowering and in the libgcc bitint APIs.
While in the x86_64 psABI a limb is 64-bit, which is perfect for both,
that is a wordsize which we can perform operations natively in,
e.g. aarch64 wants 128-bit limbs for alignment/sizing purposes, but
on the bitint lowering side I believe it would result in terribly bad code
and on the libgcc side wouldn't work at all (because it relies there on
longlong.h support).
So, the following patch makes it possible for aarch64 to use TImode
as abi_limb_mode for _BitInt(129) and larger, while using DImode as
limb_mode.
2023-12-15 Jakub Jelinek <jakub@redhat.com>
* target.h (struct bitint_info): Add abi_limb_mode member, adjust
comment.
* target.def (bitint_type_info): Mention abi_limb_mode instead of
limb_mode.
* varasm.cc (output_constant): Use abi_limb_mode rather than
limb_mode.
* stor-layout.cc (finish_bitfield_representative): Likewise. Assert
that if precision is smaller or equal to abi_limb_mode precision or
if info.big_endian is different from WORDS_BIG_ENDIAN, info.limb_mode
must be the same as info.abi_limb_mode.
(layout_type): Use abi_limb_mode rather than limb_mode.
* gimple-fold.cc (clear_padding_bitint_needs_padding_p): Likewise.
(clear_padding_type): Likewise.
* config/i386/i386.cc (ix86_bitint_type_info): Also set
info->abi_limb_mode.
* doc/tm.texi: Regenerated.
|
|
Targets that don't expose callee stacks to callers, such as nvptx, as
well as -fsplit-stack compilations, violate fundamental assumptions of
the current strub implementation. This patch enables targets to
disable strub, and disables it when -fsplit-stack is enabled.
When strub support is disabled, the testsuite will now skip strub
tests, and libgcc will not build the strub runtime components.
for gcc/ChangeLog
* target.def (have_strub_support_for): New hook.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Rebuild.
* ipa-strub.cc: Include target.h.
(strub_target_support_p): New.
(can_strub_p): Call it. Test for no flag_split_stack.
(pass_ipa_strub::adjust_at_calls_call): Check for target
support.
* config/nvptx/nvptx.cc (TARGET_HAVE_STRUB_SUPPORT_FOR):
Disable.
* doc/sourcebuild.texi (strub): Document new effective
target.
for gcc/testsuite/ChangeLog
* c-c++-common/strub-split-stack.c: New.
* c-c++-common/strub-unsupported.c: New.
* c-c++-common/strub-unsupported-2.c: New.
* c-c++-common/strub-unsupported-3.c: New.
* lib/target-supports.exp (check_effective_target_strub): New.
* c-c++-common/strub-O0.c: Require effective target strub.
* c-c++-common/strub-O1.c: Likewise.
* c-c++-common/strub-O2.c: Likewise.
* c-c++-common/strub-O2fni.c: Likewise.
* c-c++-common/strub-O3.c: Likewise.
* c-c++-common/strub-O3fni.c: Likewise.
* c-c++-common/strub-Og.c: Likewise.
* c-c++-common/strub-Os.c: Likewise.
* c-c++-common/strub-all1.c: Likewise.
* c-c++-common/strub-all2.c: Likewise.
* c-c++-common/strub-apply1.c: Likewise.
* c-c++-common/strub-apply2.c: Likewise.
* c-c++-common/strub-apply3.c: Likewise.
* c-c++-common/strub-apply4.c: Likewise.
* c-c++-common/strub-at-calls1.c: Likewise.
* c-c++-common/strub-at-calls2.c: Likewise.
* c-c++-common/strub-defer-O1.c: Likewise.
* c-c++-common/strub-defer-O2.c: Likewise.
* c-c++-common/strub-defer-O3.c: Likewise.
* c-c++-common/strub-defer-Os.c: Likewise.
* c-c++-common/strub-internal1.c: Likewise.
* c-c++-common/strub-internal2.c: Likewise.
* c-c++-common/strub-parms1.c: Likewise.
* c-c++-common/strub-parms2.c: Likewise.
* c-c++-common/strub-parms3.c: Likewise.
* c-c++-common/strub-relaxed1.c: Likewise.
* c-c++-common/strub-relaxed2.c: Likewise.
* c-c++-common/strub-short-O0-exc.c: Likewise.
* c-c++-common/strub-short-O0.c: Likewise.
* c-c++-common/strub-short-O1.c: Likewise.
* c-c++-common/strub-short-O2.c: Likewise.
* c-c++-common/strub-short-O3.c: Likewise.
* c-c++-common/strub-short-Os.c: Likewise.
* c-c++-common/strub-strict1.c: Likewise.
* c-c++-common/strub-strict2.c: Likewise.
* c-c++-common/strub-tail-O1.c: Likewise.
* c-c++-common/strub-tail-O2.c: Likewise.
* c-c++-common/strub-var1.c: Likewise.
* c-c++-common/torture/strub-callable1.c: Likewise.
* c-c++-common/torture/strub-callable2.c: Likewise.
* c-c++-common/torture/strub-const1.c: Likewise.
* c-c++-common/torture/strub-const2.c: Likewise.
* c-c++-common/torture/strub-const3.c: Likewise.
* c-c++-common/torture/strub-const4.c: Likewise.
* c-c++-common/torture/strub-data1.c: Likewise.
* c-c++-common/torture/strub-data2.c: Likewise.
* c-c++-common/torture/strub-data3.c: Likewise.
* c-c++-common/torture/strub-data4.c: Likewise.
* c-c++-common/torture/strub-data5.c: Likewise.
* c-c++-common/torture/strub-indcall1.c: Likewise.
* c-c++-common/torture/strub-indcall2.c: Likewise.
* c-c++-common/torture/strub-indcall3.c: Likewise.
* c-c++-common/torture/strub-inlinable1.c: Likewise.
* c-c++-common/torture/strub-inlinable2.c: Likewise.
* c-c++-common/torture/strub-ptrfn1.c: Likewise.
* c-c++-common/torture/strub-ptrfn2.c: Likewise.
* c-c++-common/torture/strub-ptrfn3.c: Likewise.
* c-c++-common/torture/strub-ptrfn4.c: Likewise.
* c-c++-common/torture/strub-pure1.c: Likewise.
* c-c++-common/torture/strub-pure2.c: Likewise.
* c-c++-common/torture/strub-pure3.c: Likewise.
* c-c++-common/torture/strub-pure4.c: Likewise.
* c-c++-common/torture/strub-run1.c: Likewise.
* c-c++-common/torture/strub-run2.c: Likewise.
* c-c++-common/torture/strub-run3.c: Likewise.
* c-c++-common/torture/strub-run4.c: Likewise.
* c-c++-common/torture/strub-run4c.c: Likewise.
* c-c++-common/torture/strub-run4d.c: Likewise.
* c-c++-common/torture/strub-run4i.c: Likewise.
* g++.dg/strub-run1.C: Likewise.
* g++.dg/torture/strub-init1.C: Likewise.
* g++.dg/torture/strub-init2.C: Likewise.
* g++.dg/torture/strub-init3.C: Likewise.
* gnat.dg/strub_attr.adb: Likewise.
* gnat.dg/strub_ind.adb: Likewise.
* gnat.dg/strub_access.adb: Likewise.
* gnat.dg/strub_access1.adb: Likewise.
* gnat.dg/strub_disp.adb: Likewise.
* gnat.dg/strub_disp1.adb: Likewise.
* gnat.dg/strub_ind1.adb: Likewise.
* gnat.dg/strub_ind2.adb: Likewise.
* gnat.dg/strub_intf.adb: Likewise.
* gnat.dg/strub_intf1.adb: Likewise.
* gnat.dg/strub_intf2.adb: Likewise.
* gnat.dg/strub_renm.adb: Likewise.
* gnat.dg/strub_renm1.adb: Likewise.
* gnat.dg/strub_renm2.adb: Likewise.
* gnat.dg/strub_var.adb: Likewise.
* gnat.dg/strub_var1.adb: Likewise.
for libgcc/ChangeLog
* configure.ac: Check for strub support.
* configure: Rebuilt.
* Makefile.in: Compile strub.c conditionally.
|
|
GCC 5 and earlier applied array-to-pointer decay too early,
which affected the new attribute namespace code. A reduced
example of the construct that the attribute code uses is:
struct S { template<__SIZE_TYPE__ N> S(int (&)[N]); };
struct T { int a; S b; };
int a[] = { 1 };
T t = { 1, a };
This was fixed by f85e1317f8ea933f5c615680353bd646f480f7d3
(PR 16333 et al).
This patch tries to add a minimally-invasive workaround.
gcc/ada/
* gcc-interface/utils.cc (gnat_internal_attribute_table): Add extra
braces to work around PR 16333 in older compilers.
gcc/
* attribs.cc (handle_ignored_attributes_option): Add extra
braces to work around PR 16333 in older compilers.
* config/aarch64/aarch64.cc (aarch64_gnu_attribute_table): Likewise.
(aarch64_arm_attribute_table): Likewise.
* config/arm/arm.cc (arm_gnu_attribute_table): Likewise.
* config/i386/i386-options.cc (ix86_gnu_attribute_table): Likewise.
* config/ia64/ia64.cc (ia64_gnu_attribute_table): Likewise.
* config/rs6000/rs6000.cc (rs6000_gnu_attribute_table): Likewise.
* target-def.h (TARGET_GNU_ATTRIBUTES): Likewise.
* genhooks.cc (emit_init_macros): Likewise, when emitting the
instantiation of TARGET_ATTRIBUTE_TABLE.
* langhooks-def.h (LANG_HOOKS_INITIALIZER): Likewise, when
instantiating LANG_HOOKS_ATTRIBUTE_TABLE.
(LANG_HOOKS_ATTRIBUTE_TABLE): Define to be empty by default.
* target.def (attribute_table): Likewise.
gcc/c-family/
* c-attribs.cc (c_common_gnu_attribute_table): Add extra
braces to work around PR 16333 in older compilers.
gcc/c/
* c-decl.cc (std_attribute_table): Add extra braces to work
around PR 16333 in older compilers.
gcc/cp/
* tree.cc (cxx_gnu_attribute_table): Add extra braces to work
around PR 16333 in older compilers.
gcc/d/
* d-attribs.cc (d_langhook_common_attribute_table): Add extra braces
to work around PR 16333 in older compilers.
(d_langhook_gnu_attribute_table): Likewise.
gcc/fortran/
* f95-lang.cc (gfc_gnu_attribute_table): Add extra braces to work
around PR 16333 in older compilers.
gcc/jit/
* dummy-frontend.cc (jit_gnu_attribute_table): Add extra braces
to work around PR 16333 in older compilers.
(jit_format_attribute_table): Likewise.
gcc/lto/
* lto-lang.cc (lto_gnu_attribute_table): Add extra braces to work
around PR 16333 in older compilers.
(lto_format_attribute_table): Likewise.
|
|
Arm's SME has an array called ZA that for inline asm purposes
is effectively a form of special-purpose memory. It doesn't
have an associated storage type and so can't be passed and
returned in normal C/C++ objects.
We'd therefore like "za" in a clobber list to mean that an inline
asm can read from and write to ZA. (Just reading or writing
individually is unlikely to be useful, but we could add syntax
for that too if necessary.)
There is currently a TARGET_MD_ASM_ADJUST target hook that allows
targets to add clobbers to an asm instruction. This patch
extends that to allow targets to add USEs as well.
gcc/
* target.def (md_asm_adjust): Add a uses parameter.
* doc/tm.texi: Regenerate.
* cfgexpand.cc (expand_asm_loc): Update call to md_asm_adjust.
Handle any USEs created by the target.
(expand_asm_stmt): Likewise.
* recog.cc (asm_noperands): Handle asms with USEs.
(decode_asm_operands): Likewise.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add uses
parameter.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm.cc (thumb1_md_asm_adjust): Likewise.
* config/avr/avr.cc (avr_md_asm_adjust): Likewise.
* config/cris/cris.cc (cris_md_asm_adjust): Likewise.
* config/i386/i386.cc (ix86_md_asm_adjust): Likewise.
* config/mn10300/mn10300.cc (mn10300_md_asm_adjust): Likewise.
* config/nds32/nds32.cc (nds32_md_asm_adjust): Likewise.
* config/pdp11/pdp11.cc (pdp11_md_asm_adjust): Likewise.
* config/rs6000/rs6000.cc (rs6000_md_asm_adjust): Likewise.
* config/s390/s390.cc (s390_md_asm_adjust): Likewise.
* config/vax/vax.cc (vax_md_asm_adjust): Likewise.
* config/visium/visium.cc (visium_md_asm_adjust): Likewise.
|
|
We have the following two hooks into the call expansion code:
- TARGET_CALL_ARGS is called for each argument before arguments
are moved into hard registers.
- TARGET_END_CALL_ARGS is called after the end of the call
sequence (specifically, after any return value has been
moved to a pseudo).
This patch adds a TARGET_START_CALL_ARGS hook that is called before
the TARGET_CALL_ARGS sequence. This means that TARGET_START_CALL_REGS
and TARGET_END_CALL_REGS bracket the region in which argument registers
might be live. They also bracket a region in which the only call
emiitted by target-independent code is the call to the target function
itself. (For example, TARGET_START_CALL_ARGS happens after any use of
memcpy to copy arguments, and TARGET_END_CALL_ARGS happens before any
use of memcpy to copy the result.)
Also, the patch adds the cumulative argument structure as an argument
to the hooks, so that the target can use it to record and retrieve
information about the call as a whole.
The TARGET_CALL_ARGS docs said:
While generating RTL for a function call, this target hook is invoked once
for each argument passed to the function, either a register returned by
``TARGET_FUNCTION_ARG`` or a memory location. It is called just
- before the point where argument registers are stored.
The last bit was true for normal calls, but for libcalls the hook was
invoked earlier, before stack arguments have been copied. I don't think
this caused a practical difference for nvptx (the only port to use the
hooks) since I wouldn't expect any libcalls to take stack parameters.
gcc/
* doc/tm.texi.in: Add TARGET_START_CALL_ARGS.
* doc/tm.texi: Regenerate.
* target.def (start_call_args): New hook.
(call_args, end_call_args): Add a parameter for the cumulative
argument information.
* hooks.h (hook_void_rtx_tree): Delete.
* hooks.cc (hook_void_rtx_tree): Likewise.
* targhooks.h (hook_void_CUMULATIVE_ARGS): Declare.
(hook_void_CUMULATIVE_ARGS_rtx_tree): Likewise.
* targhooks.cc (hook_void_CUMULATIVE_ARGS): New function.
(hook_void_CUMULATIVE_ARGS_rtx_tree): Likewise.
* calls.cc (expand_call): Call start_call_args before computing
and storing stack parameters. Pass the cumulative argument
information to call_args and end_call_args.
(emit_library_call_value_1): Likewise.
* config/nvptx/nvptx.cc (nvptx_call_args): Add a cumulative
argument parameter.
(nvptx_end_call_args): Likewise.
|
|
Epilogues for sibling calls are generated using the
sibcall_epilogue pattern. One disadvantage of this approach
is that the target doesn't know which call the epilogue is for,
even though the code that generates the pattern has the call
to hand.
Although call instructions are currently rtxes, and so could be
passed as an operand to the pattern, the main point of introducing
rtx_insn was to move towards separating the rtx and insn types
(a good thing IMO). There also isn't an existing practice of
passing genuine instructions (as opposed to labels) to
instruction patterns.
This patch therefore adds a hook that can be defined as an
alternative to sibcall_epilogue. The advantage is that it
can be passed the call; the disadvantage is that it can't
use .md conveniences like generating instructions from
textual patterns (although most epilogues are too complex
to benefit much from that anyway).
gcc/
* doc/tm.texi.in: Add TARGET_EMIT_EPILOGUE_FOR_SIBCALL.
* doc/tm.texi: Regenerate.
* target.def (emit_epilogue_for_sibcall): New hook.
* calls.cc (can_implement_as_sibling_call_p): Use it.
* function.cc (thread_prologue_and_epilogue_insns): Likewise.
(reposition_prologue_and_epilogue_notes): Likewise.
* config/aarch64/aarch64-protos.h (aarch64_expand_epilogue): Take
an rtx_call_insn * rather than a bool.
* config/aarch64/aarch64.cc (aarch64_expand_epilogue): Likewise.
(TARGET_EMIT_EPILOGUE_FOR_SIBCALL): Define.
* config/aarch64/aarch64.md (epilogue): Update call.
(sibcall_epilogue): Delete.
|
|
Arm's SME adds a new processor mode called streaming mode.
This mode enables some new (matrix-oriented) instructions and
disables several existing groups of instructions, such as most
Advanced SIMD vector instructions and a much smaller set of SVE
instructions. It can also change the current vector length.
There are instructions to switch in and out of streaming mode.
However, their effect on the ISA and vector length can't be represented
directly in RTL, so they need to be emitted late in the pass pipeline,
close to md_reorg.
It's sometimes the responsibility of the prologue and epilogue to
switch modes, which means we need to emit the prologue and epilogue
sequences late as well. (This loses shrink-wrapping and scheduling
opportunities, but that's a price worth paying.)
This patch therefore adds a target hook for forcing prologue
and epilogue insertion to happen later in the pipeline.
gcc/
* target.def (use_late_prologue_epilogue): New hook.
* doc/tm.texi.in: Add TARGET_USE_LATE_PROLOGUE_EPILOGUE.
* doc/tm.texi: Regenerate.
* passes.def (pass_late_thread_prologue_and_epilogue): New pass.
* tree-pass.h (make_pass_late_thread_prologue_and_epilogue): Declare.
* function.cc (pass_thread_prologue_and_epilogue::gate): New function.
(pass_data_late_thread_prologue_and_epilogue): New pass variable.
(pass_late_thread_prologue_and_epilogue): New pass class.
(make_pass_late_thread_prologue_and_epilogue): New function.
|
|
Currently there are four static sources of attributes:
- LANG_HOOKS_ATTRIBUTE_TABLE
- LANG_HOOKS_COMMON_ATTRIBUTE_TABLE
- LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE
- TARGET_ATTRIBUTE_TABLE
All of the attributes in these tables go in the "gnu" namespace.
This means that they can use the traditional GNU __attribute__((...))
syntax and the standard [[gnu::...]] syntax.
Standard attributes are registered dynamically with a null namespace.
There are no supported attributes in other namespaces (clang, vendor
namespaces, etc.).
This patch tries to generalise things by making the namespace
part of the attribute specification.
It's usual for multiple attributes to be defined in the same namespace,
so rather than adding the namespace to each individual definition,
it seemed better to group attributes in the same namespace together.
This would also allow us to reuse the same table for clang attributes
that are written with the GNU syntax, or other similar situations
where the attribute can be accessed via multiple "spellings".
The patch therefore adds a scoped_attribute_specs that contains
a namespace and a list of attributes in that namespace.
It's still possible to have multiple scoped_attribute_specs
for the same namespace. E.g. it makes sense to keep the
C++-specific, C/C++-common, and format-related attributes in
separate tables, even though they're all GNU attributes.
Current lists of attributes are terminated by a null name.
Rather than keep that for the new structure, it seemed neater
to use an array_slice. This also makes the tables slighly more
compact.
In general, a target might want to support attributes in multiple
namespaces. Rather than have a separate hook for each possibility
(like the three langhooks above), it seemed better to make
TARGET_ATTRIBUTE_TABLE a table of tables. Specifically, it's
an array_slice of scoped_attribute_specs.
We can do the same thing for langhooks, which allows the three hooks
above to be merged into a single LANG_HOOKS_ATTRIBUTE_TABLE.
It also allows the standard attributes to be registered statically
and checked by the usual attribs.cc checks.
The patch adds a TARGET_GNU_ATTRIBUTES helper for the common case
in which a target wants a single table of gnu attributes. It can
only be used if the table is free of preprocessor directives.
There are probably other things we need to do to make vendor namespaces
work smoothly. E.g. in principle it would be good to make exclusion
sets namespace-aware. But to some extent we have that with standard
vs. gnu attributes too. This patch is just supposed to be a first step.
gcc/
* attribs.h (scoped_attribute_specs): New structure.
(register_scoped_attributes): Take a reference to a
scoped_attribute_specs instead of separate namespace and array
parameters.
* plugin.h (register_scoped_attributes): Likewise.
* attribs.cc (register_scoped_attributes): Likewise.
(attribute_tables): Change into an array of scoped_attribute_specs
pointers. Reduce to 1 element for frontends and 1 element for targets.
(empty_attribute_table): Delete.
(check_attribute_tables): Update for changes to attribute_tables.
Use a hash_set to identify duplicates.
(handle_ignored_attributes_option): Update for above changes.
(init_attributes): Likewise.
(excl_pair): Delete.
(test_attribute_exclusions): Update for above changes. Don't
enforce symmetry for standard attributes in the top-level namespace.
* langhooks-def.h (LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE): Likewise.
(LANG_HOOKS_INITIALIZER): Update accordingly.
(LANG_HOOKS_ATTRIBUTE_TABLE): Define to an empty constructor.
* langhooks.h (lang_hooks::common_attribute_table): Delete.
(lang_hooks::format_attribute_table): Likewise.
(lang_hooks::attribute_table): Redefine to an array of
scoped_attribute_specs pointers.
* target-def.h (TARGET_GNU_ATTRIBUTES): New macro.
* target.def (attribute_spec): Redefine to return an array of
scoped_attribute_specs pointers.
* tree-inline.cc (function_attribute_inlinable_p): Update accordingly.
* doc/tm.texi: Regenerate.
* config/aarch64/aarch64.cc (aarch64_attribute_table): Define using
TARGET_GNU_ATTRIBUTES.
* config/alpha/alpha.cc (vms_attribute_table): Likewise.
* config/avr/avr.cc (avr_attribute_table): Likewise.
* config/bfin/bfin.cc (bfin_attribute_table): Likewise.
* config/bpf/bpf.cc (bpf_attribute_table): Likewise.
* config/csky/csky.cc (csky_attribute_table): Likewise.
* config/epiphany/epiphany.cc (epiphany_attribute_table): Likewise.
* config/gcn/gcn.cc (gcn_attribute_table): Likewise.
* config/h8300/h8300.cc (h8300_attribute_table): Likewise.
* config/loongarch/loongarch.cc (loongarch_attribute_table): Likewise.
* config/m32c/m32c.cc (m32c_attribute_table): Likewise.
* config/m32r/m32r.cc (m32r_attribute_table): Likewise.
* config/m68k/m68k.cc (m68k_attribute_table): Likewise.
* config/mcore/mcore.cc (mcore_attribute_table): Likewise.
* config/microblaze/microblaze.cc (microblaze_attribute_table):
Likewise.
* config/mips/mips.cc (mips_attribute_table): Likewise.
* config/msp430/msp430.cc (msp430_attribute_table): Likewise.
* config/nds32/nds32.cc (nds32_attribute_table): Likewise.
* config/nvptx/nvptx.cc (nvptx_attribute_table): Likewise.
* config/riscv/riscv.cc (riscv_attribute_table): Likewise.
* config/rl78/rl78.cc (rl78_attribute_table): Likewise.
* config/rx/rx.cc (rx_attribute_table): Likewise.
* config/s390/s390.cc (s390_attribute_table): Likewise.
* config/sh/sh.cc (sh_attribute_table): Likewise.
* config/sparc/sparc.cc (sparc_attribute_table): Likewise.
* config/stormy16/stormy16.cc (xstormy16_attribute_table): Likewise.
* config/v850/v850.cc (v850_attribute_table): Likewise.
* config/visium/visium.cc (visium_attribute_table): Likewise.
* config/arc/arc.cc (arc_attribute_table): Likewise. Move further
down file.
* config/arm/arm.cc (arm_attribute_table): Update for above changes,
using...
(arm_gnu_attributes, arm_gnu_attribute_table): ...these new globals.
* config/i386/i386-options.h (ix86_attribute_table): Delete.
(ix86_gnu_attribute_table): Declare.
* config/i386/i386-options.cc (ix86_attribute_table): Replace with...
(ix86_gnu_attributes, ix86_gnu_attribute_table): ...these two globals.
* config/i386/i386.cc (ix86_attribute_table): Define as an array of
scoped_attribute_specs pointers.
* config/ia64/ia64.cc (ia64_attribute_table): Update for above changes,
using...
(ia64_gnu_attributes, ia64_gnu_attribute_table): ...these new globals.
* config/rs6000/rs6000.cc (rs6000_attribute_table): Update for above
changes, using...
(rs6000_gnu_attributes, rs6000_gnu_attribute_table): ...these new
globals.
gcc/ada/
* gcc-interface/gigi.h (gnat_internal_attribute_table): Change
type to scoped_attribute_specs.
* gcc-interface/utils.cc (gnat_internal_attribute_table): Likewise,
using...
(gnat_internal_attributes): ...this as the underlying array.
* gcc-interface/misc.cc (gnat_attribute_table): New global.
(LANG_HOOKS_ATTRIBUTE_TABLE): Use it.
gcc/c-family/
* c-common.h (c_common_attribute_table): Replace with...
(c_common_gnu_attribute_table): ...this.
(c_common_format_attribute_table): Change type to
scoped_attribute_specs.
* c-attribs.cc (c_common_attribute_table): Replace with...
(c_common_gnu_attributes, c_common_gnu_attribute_table): ...these
new globals.
(c_common_format_attribute_table): Change type to
scoped_attribute_specs, using...
(c_common_format_attributes): ...this as the underlying array.
gcc/c/
* c-tree.h (std_attribute_table): Declare.
* c-decl.cc (std_attribute_table): Change type to
scoped_attribute_specs, using...
(std_attributes): ...this as the underlying array.
(c_init_decl_processing): Remove call to register_scoped_attributes.
* c-objc-common.h (c_objc_attribute_table): New global.
(LANG_HOOKS_ATTRIBUTE_TABLE): Use it.
(LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE): Delete.
gcc/cp/
* cp-tree.h (cxx_attribute_table): Delete.
(cxx_gnu_attribute_table, std_attribute_table): Declare.
* cp-objcp-common.h (LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE): Delete.
(cp_objcp_attribute_table): New table.
(LANG_HOOKS_ATTRIBUTE_TABLE): Redefine.
* tree.cc (cxx_attribute_table): Replace with...
(cxx_gnu_attributes, cxx_gnu_attribute_table): ...these globals.
(std_attribute_table): Change type to scoped_attribute_specs, using...
(std_attributes): ...this as the underlying array.
(init_tree): Remove call to register_scoped_attributes.
gcc/d/
* d-tree.h (d_langhook_attribute_table): Replace with...
(d_langhook_gnu_attribute_table): ...this.
(d_langhook_common_attribute_table): Change type to
scoped_attribute_specs.
* d-attribs.cc (d_langhook_common_attribute_table): Change type to
scoped_attribute_specs, using...
(d_langhook_common_attributes): ...this as the underlying array.
(d_langhook_attribute_table): Replace with...
(d_langhook_gnu_attributes, d_langhook_gnu_attribute_table): ...these
new globals.
(uda_attribute_p): Update accordingly, and update for new
targetm.attribute_table type.
* d-lang.cc (d_langhook_attribute_table): New global.
(LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
gcc/fortran/
* f95-lang.cc: Include attribs.h.
(gfc_attribute_table): Change to an array of scoped_attribute_specs
pointers, using...
(gfc_gnu_attributes, gfc_gnu_attribute_table): ...these new globals.
gcc/jit/
* dummy-frontend.cc (jit_format_attribute_table): Change type to
scoped_attribute_specs, using...
(jit_format_attributes): ...this as the underlying array.
(jit_attribute_table): Change to an array of scoped_attribute_specs
pointers, using...
(jit_gnu_attributes, jit_gnu_attribute_table): ...these new globals
for the original array. Include the format attributes.
(LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_ATTRIBUTE_TABLE): Define.
gcc/lto/
* lto-lang.cc (lto_format_attribute_table): Change type to
scoped_attribute_specs, using...
(lto_format_attributes): ...this as the underlying array.
(lto_attribute_table): Change to an array of scoped_attribute_specs
pointers, using...
(lto_gnu_attributes, lto_gnu_attribute_table): ...these new globals
for the original array. Include the format attributes.
(LANG_HOOKS_COMMON_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE): Delete.
(LANG_HOOKS_ATTRIBUTE_TABLE): Define.
|
|
In <https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628748.html>
I proposed -fhardened, a new umbrella option that enables a reasonable set
of hardening flags. The read of the room seems to be that the option
would be useful. So here's a patch implementing that option.
Currently, -fhardened enables:
-D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
-D_GLIBCXX_ASSERTIONS
-ftrivial-auto-var-init=zero
-fPIE -pie -Wl,-z,relro,-z,now
-fstack-protector-strong
-fstack-clash-protection
-fcf-protection=full (x86 GNU/Linux only)
-fhardened will not override options that were specified on the command line
(before or after -fhardened). For example,
-D_FORTIFY_SOURCE=1 -fhardened
means that _FORTIFY_SOURCE=1 will be used. Similarly,
-fhardened -fstack-protector
will not enable -fstack-protector-strong.
Currently, -fhardened is only supported on GNU/Linux.
In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
to anything. This patch provides -Whardened, enabled by default, which
warns when -fhardened couldn't enable a particular option. I think most
often it will say that _FORTIFY_SOURCE wasn't enabled because optimization
were not enabled.
gcc/c-family/ChangeLog:
* c-opts.cc: Include "target.h".
(c_finish_options): Maybe cpp_define _FORTIFY_SOURCE
and _GLIBCXX_ASSERTIONS.
gcc/ChangeLog:
* common.opt (Whardened, fhardened): New options.
* config.in: Regenerate.
* config/bpf/bpf.cc: Include "opts.h".
(bpf_option_override): If flag_stack_protector_set_by_fhardened_p, do
not inform that -fstack-protector does not work.
* config/i386/i386-options.cc (ix86_option_override_internal): When
-fhardened, maybe enable -fcf-protection=full.
* config/linux-protos.h (linux_fortify_source_default_level): Declare.
* config/linux.cc (linux_fortify_source_default_level): New.
* config/linux.h (TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL): Redefine.
* configure: Regenerate.
* configure.ac: Check if the linker supports '-z now' and '-z relro'.
Check if -fhardened is supported on $target_os.
* doc/invoke.texi: Document -fhardened and -Whardened.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL): Add.
* gcc.cc (driver_handle_option): Remember if any link options or -static
were specified on the command line.
(process_command): When -fhardened, maybe enable -pie and
-Wl,-z,relro,-z,now.
* opts.cc (flag_stack_protector_set_by_fhardened_p): New global.
(finish_options): When -fhardened, enable
-ftrivial-auto-var-init=zero and -fstack-protector-strong.
(print_help_hardened): New.
(print_help): Call it.
* opts.h (flag_stack_protector_set_by_fhardened_p): Declare.
* target.def (fortify_source_default_level): New target hook.
* targhooks.cc (default_fortify_source_default_level): New.
* targhooks.h (default_fortify_source_default_level): Declare.
* toplev.cc (process_options): When -fhardened, enable
-fstack-clash-protection. If flag_stack_protector_set_by_fhardened_p,
do not warn that -fstack-protector not supported for this target.
Don't enable -fhardened when !HAVE_FHARDENED_SUPPORT.
gcc/testsuite/ChangeLog:
* gcc.misc-tests/help.exp: Test -fhardened.
* c-c++-common/fhardened-1.S: New test.
* c-c++-common/fhardened-1.c: New test.
* c-c++-common/fhardened-10.c: New test.
* c-c++-common/fhardened-11.c: New test.
* c-c++-common/fhardened-12.c: New test.
* c-c++-common/fhardened-13.c: New test.
* c-c++-common/fhardened-14.c: New test.
* c-c++-common/fhardened-15.c: New test.
* c-c++-common/fhardened-2.c: New test.
* c-c++-common/fhardened-3.c: New test.
* c-c++-common/fhardened-4.c: New test.
* c-c++-common/fhardened-5.c: New test.
* c-c++-common/fhardened-6.c: New test.
* c-c++-common/fhardened-7.c: New test.
* c-c++-common/fhardened-8.c: New test.
* c-c++-common/fhardened-9.c: New test.
* gcc.target/i386/cf_check-6.c: New test.
|
|
Add target data to indicate if libatomic is available.
gcc/ChangeLog:
* config/rtems.h (TARGET_HAVE_LIBATOMIC): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_HAVE_LIBATOMIC): Add.
* target.def (have_libatomic): New.
|
|
This reverts commit 8cdcea51c0fd753e6a652c9b236e91b3a6e0911c.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Do not define
__LIBGCC_GCOV_TYPE_SIZE.
gcc/ChangeLog:
* config/sparc/rtemself.h (SPARC_GCOV_TYPE_SIZE): Remove.
* config/sparc/sparc.cc (sparc_gcov_type_size): Likewise.
(TARGET_GCOV_TYPE_SIZE): Likewise.
* coverage.cc (get_gcov_type): Use LONG_LONG_TYPE_SIZE instead
of removed target hook.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_GCOV_TYPE_SIZE): Remove.
* target.def: Likewise.
* targhooks.cc (default_gcov_type_size): Likewise.
* targhooks.h (default_gcov_type_size): Likewise.
libgcc/ChangeLog:
* libgcov.h (gcov_type): Use LONG_LONG_TYPE_SIZE.
(gcov_type_unsigned): Likewise.
|
|
This patch adds a way for targets to ask that selected mode changes
be brought forward, through a combination of:
(1) requiring a mode in blocks where the entity was previously
transparent
(2) pushing the transition at the head of a block onto incomging edges
SME has two uses for this:
- A "one-shot" entity that, for any given path of execution,
either stays off or makes exactly one transition from off to on.
This relies only on (1) above; see the hook description for more info.
The main purpose of using mode-switching for this entity is to
shrink-wrap the code that requires it.
- A second entity for which all transitions must be from known
modes, which is enforced using a combination of (1) and (2).
More specifically, (1) looks for edges B1->B2 for which:
- B2 requires a specific mode and
- B1 does not guarantee a specific starting mode
In this system, such an edge is only possible if the entity is
transparent in B1. (1) then forces B1 to require some safe common
mode. Applying this inductively means that all incoming edges are
from known modes. If different edges give different starting modes,
(2) pushes the transitions onto the edges themselves; this only
happens if the entity is not transparent in some predecessor block.
The patch also uses the back-propagation as an excuse to do a simple
on-the-fly optimisation.
Hopefully the comments in the patch explain things a bit better.
gcc/
* target.def (mode_switching.backprop): New hook.
* doc/tm.texi.in (TARGET_MODE_BACKPROP): New @hook.
* doc/tm.texi: Regenerate.
* mode-switching.cc (struct bb_info): Add single_succ.
(confluence_info): Add transp field.
(single_succ_confluence_n, single_succ_transfer): New functions.
(backprop_confluence_n, backprop_transfer): Likewise.
(optimize_mode_switching): Use them. Push mode transitions onto
a block's incoming edges, if the backprop hook requires it.
|
|
The mode-switching pass assumed that all of an entity's modes
were mutually exclusive. However, the upcoming SME changes
have an entity with some overlapping modes, so that there is
sometimes a "superunion" mode that contains two given modes.
We can use this relationship to pass something more helpful than
"don't know" to the emit hook.
This patch adds a new hook that targets can use to specify
a mode confluence operator.
With mutually exclusive modes, it's possible to compute a block's
incoming and outgoing modes by looking at its availability sets.
With the confluence operator, we instead need to solve a full
dataflow problem.
However, when emitting a mode transition, the upcoming SME use of
mode-switching benefits from having as much information as possible
about the starting mode. Calculating this information is definitely
worth the compile time.
The dataflow problem is written to work before and after the LCM
problem has been solved. A later patch makes use of this.
While there (since git blame would ping me for the reindented code),
I used a lambda to avoid the cut-&-pasted loops.
gcc/
* target.def (mode_switching.confluence): New hook.
* doc/tm.texi (TARGET_MODE_CONFLUENCE): New @hook.
* doc/tm.texi.in: Regenerate.
* mode-switching.cc (confluence_info): New variable.
(mode_confluence, forward_confluence_n, forward_transfer): New
functions.
(optimize_mode_switching): Use them to calculate mode_in when
TARGET_MODE_CONFLUENCE is defined.
|
|
This patch passes the set of live hard registers to the after hook,
like the previous one did for the needed hook.
gcc/
* target.def (mode_switching.after): Add a regs_live parameter.
* doc/tm.texi: Regenerate.
* config/epiphany/epiphany-protos.h (epiphany_mode_after): Update
accordingly.
* config/epiphany/epiphany.cc (epiphany_mode_needed): Likewise.
(epiphany_mode_after): Likewise.
* config/i386/i386.cc (ix86_mode_after): Likewise.
* config/riscv/riscv.cc (riscv_mode_after): Likewise.
* config/sh/sh.cc (sh_mode_after): Likewise.
* mode-switching.cc (optimize_mode_switching): Likewise.
|
|
The emit hook already takes the set of live hard registers as input.
This patch passes it to the needed hook too. SME uses this to
optimise the mode choice based on whether state is live or dead.
The main caller already had access to the required info, but the
special handling of return values did not.
gcc/
* target.def (mode_switching.needed): Add a regs_live parameter.
* doc/tm.texi: Regenerate.
* config/epiphany/epiphany-protos.h (epiphany_mode_needed): Update
accordingly.
* config/epiphany/epiphany.cc (epiphany_mode_needed): Likewise.
* config/epiphany/mode-switch-use.cc (insert_uses): Likewise.
* config/i386/i386.cc (ix86_mode_needed): Likewise.
* config/riscv/riscv.cc (riscv_mode_needed): Likewise.
* config/sh/sh.cc (sh_mode_needed): Likewise.
* mode-switching.cc (optimize_mode_switching): Likewise.
(create_pre_exit): Likewise, using the DF simulate functions
to calculate the required information.
|
|
The mode-switching pass already had hooks to say what mode
an entity is in on entry to a function and what mode it must
be in on return. For SME, we also want to say what mode an
entity is guaranteed to be in on entry to an exception handler.
gcc/
* target.def (mode_switching.eh_handler): New hook.
* doc/tm.texi.in (TARGET_MODE_EH_HANDLER): New @hook.
* doc/tm.texi: Regenerate.
* mode-switching.cc (optimize_mode_switching): Use eh_handler
to get the mode on entry to an exception handler.
|