Age | Commit message | Author | Files | Lines |
|
-mtune=intel is used to generate a single binary to run well on both big
core and small core, similar to hybrid CPUs. Update -mtune=intel to tune
for Diamond Rapids and Clearwater Forest, instead of Silvermont.
PR target/120815
* common/config/i386/i386-common.cc (processor_alias_table):
Replace CPU_SLM/PTA_NEHALEM with CPU_HASWELL/PTA_HASWELL for
PROCESSOR_INTEL.
* config/i386/i386-options.cc (processor_cost_table): Replace
intel_cost with alderlake_cost.
* config/i386/x86-tune-costs.h (intel_cost): Removed.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Treat
PROCESSOR_INTEL like PROCESSOR_ALDERLAKE.
(ix86_adjust_cost): Likewise.
* doc/invoke.texi: Update -mtune=intel for Diamond Rapids and
Clearwater Forest.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
|
|
This patch adds support for the RISC-V Profiles RVA23S64 and RVB23S64.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New Profiles.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-rva23s.c: New test.
* gcc.target/riscv/arch-rvb23s.c: New test.
|
|
The middle-end provides a way to make vectorization more profitable by
scaling vect-scalar-cost-multiplier. This patch adds a more user-friendly
option that makes this easier to use.
I propose making it an actual -m option that we document and retain vs using
the parameter name. In the future I would like to extend this option to modify
additional costing in the AArch64 backend itself.
This can be used together with --param aarch64-autovec-preference to get the
vectorizer to, say, always vectorize with SVE. I did consider making this an
additional enum to --param aarch64-autovec-preference but I also think this is
a useful thing to be able to set with pragmas and attributes, but am open to
suggestions.
Note that as a follow up I plan on extending -fdump-tree-vect to support -stats
which is then intended to be usable with this flag.
gcc/ChangeLog:
* config/aarch64/aarch64.opt (max-vectorization): New.
* config/aarch64/aarch64.cc (aarch64_override_options_internal): Save
and restore option.
Implement it through vect-scalar-cost-multiplier.
(aarch64_attributes): Default to off.
* common/config/aarch64/aarch64-common.cc (aarch64_handle_option):
Initialize option.
* doc/extend.texi (max-vectorization): Document attribute.
* doc/invoke.texi (max-vectorization): Document flag.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/cost_model_17.c: New test.
* gcc.target/aarch64/sve/cost_model_18.c: New test.
|
|
Add b-ext in RVA/B23 as independent extension flags and add supm in
RVA23.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add b-ext and supm.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-53.c: Update testcase.
|
|
Structured binding is a C++ language feature introduced in C++17, which is
newer than the C++14 that GCC requires of its build environment.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Remove structured binding
from the code.
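For illustration, a minimal sketch of the kind of rewrite this implies. The map type and function name below are hypothetical and are not taken from riscv-common.cc; the point is only how a C++17 structured binding is spelled in C++14.

```cpp
#include <map>
#include <string>

// Hypothetical table mapping an extension name to a major version.
using version_map = std::map<std::string, int>;

// C++17 form, rejected when the host compiler only provides C++14:
//   for (const auto &[name, major] : versions) total += major;

// C++14-compatible form: bind the pair, then name its members explicitly.
int sum_major_versions (const version_map &versions)
{
  int total = 0;
  for (const auto &entry : versions)
    {
      const std::string &name = entry.first;
      int major = entry.second;
      (void) name; // unused here; kept only to mirror the binding
      total += major;
    }
  return total;
}
```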
|
|
This patch allows an -march string like
-march=sifive-p670
in order to override a previous -march in a simple way.
Suppose we have a Makefile that specifies -march=rv64gc by default.
A user-specified -mcpu=sifive-p670 would be after the -march in the
options string and thus only set -mtune=sifive-p670 (as -mcpu does not
override a previously specified -march or -mtune).
So if we wanted to override we would need to specify the full, lengthy
-march=rv64gcv_... string instead of a simple -mcpu=...
Therefore this patch always first tries to interpret -march= as a CPU
string. If it is a supported CPU we use its march properties and let it
override previously specified options. Otherwise the behavior is as
before. This enables the "last-specified option wins" behavior GCC
normally employs.
Note that, unlike on x86 or other targets, -march does not imply -mtune here.
So an -march=CPU won't override a previously specified -mtune=other-CPU.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_subset_list::parse_base_ext):
Adjust error message.
(riscv_handle_option): Parse as CPU string first.
(riscv_expand_arch): Ditto.
* doc/invoke.texi: Document.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-56.c: New test.
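The precedence described in this commit message can be modeled roughly as follows. This is a simplified sketch, not the logic in riscv_handle_option or riscv_expand_arch: the option representation, the `resolve` function and the one-entry CPU table are assumptions, and the arch string is left elided exactly as in the message above.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

struct target_choice
{
  std::string arch; // effective -march value
  std::string tune; // effective -mtune value
};

// Hypothetical CPU table. The arch string is a placeholder that keeps the
// elision from the commit message; it is not the real definition.
static const std::map<std::string, std::string> cpu_to_arch = {
  {"sifive-p670", "rv64gcv_..."},
};

// Options are (name, value) pairs in command-line order.
target_choice
resolve (const std::vector<std::pair<std::string, std::string>> &opts)
{
  std::string arch, tune, cpu;
  for (const auto &opt : opts)
    {
      if (opt.first == "-march")
        {
          // New behavior: try the value as a CPU name first, so that the
          // last -march on the command line wins.
          auto it = cpu_to_arch.find (opt.second);
          arch = (it != cpu_to_arch.end ()) ? it->second : opt.second;
        }
      else if (opt.first == "-mtune")
        tune = opt.second;
      else if (opt.first == "-mcpu")
        cpu = opt.second;
    }
  // -mcpu only fills in what -march / -mtune left unspecified.
  if (!cpu.empty ())
    {
      auto it = cpu_to_arch.find (cpu);
      if (arch.empty () && it != cpu_to_arch.end ())
        arch = it->second;
      if (tune.empty ())
        tune = cpu;
    }
  return {arch, tune};
}
```

In this model, `-march=rv64gc -mcpu=sifive-p670` keeps the rv64gc architecture and only changes tuning, while `-march=rv64gc -march=sifive-p670` switches the architecture and leaves tuning untouched.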
|
|
During the GCC compilation, some warnings about temporary object dangling
references emerged. They appeared in these code lines in riscv-common.cc:
`const riscv_ext_info_t &implied_ext_info`,
`const riscv_ext_info_t &ext_info = get_riscv_ext_info (ext)` and
`auto &ext_info = get_riscv_ext_info (search_ext)`.
The issue arose because the local variable types were not used in a standardized
way, causing their references to dangle once the function ended.
To fix this, the patch changes the argument type of get_riscv_ext_info to
`const char *`, thereby eliminating the warnings.
Changes for v2:
- Change the argument type of get_riscv_ext_info to `const char *` to eliminate the warnings.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (get_riscv_ext_info): Fix argument type.
(riscv_subset_list::check_implied_ext): Type conversion.
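One plausible reading of this warning, sketched with hypothetical names (`ext_info_sketch`, `lookup_by_string`, `lookup`): a function that takes `const std::string &` and returns a reference triggers GCC's -Wdangling-reference heuristic when a caller passes a string literal, because a temporary std::string is created for the call. Taking `const char *` avoids the temporary. This is an illustration of the pattern, not the code from riscv-common.cc.

```cpp
#include <cassert>
#include <cstring>
#include <string>

struct ext_info_sketch
{
  const char *name;
  int major;
};

static const ext_info_sketch ext_table[] = {{"zicsr", 2}, {"zca", 1}};

// Before (sketch): a call such as
//   const ext_info_sketch &e = lookup_by_string ("zca");
// materializes a temporary std::string, and the compiler cannot tell that
// the returned reference does not point into that temporary.
const ext_info_sketch &lookup_by_string (const std::string &name)
{
  for (const ext_info_sketch &e : ext_table)
    if (name == e.name)
      return e;
  return ext_table[0]; // fallback for this sketch only
}

// After (sketch): no temporary object is created at the call site.
const ext_info_sketch &lookup (const char *name)
{
  assert (name != nullptr);
  for (const ext_info_sketch &e : ext_table)
    if (std::strcmp (name, e.name) == 0)
      return e;
  return ext_table[0]; // fallback for this sketch only
}
```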
|
|
As we mentioned in GCC 15, we will remove avx10.1-256/512 and evex512
in GCC 16. The behavior of combining AVX10 and AVX512 options is also
simplified in GCC 16, since AVX10.1 now implies AVX512, which makes the
behavior match everyone else.
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_available_features): Remove feature set for AVX10_1_256.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_EVEX512_SET): Removed.
(OPTION_MASK_ISA2_AVX10_1_256_SET): Removed.
(OPTION_MASK_ISA_AVX10_1_SET): Imply all AVX512 features.
(OPTION_MASK_ISA2_AVX10_1_SET): Ditto.
(OPTION_MASK_ISA2_AVX2_UNSET): Remove AVX10_1_UNSET.
(OPTION_MASK_ISA2_EVEX512_UNSET): Removed.
(OPTION_MASK_ISA2_AVX10_1_UNSET): Remove AVX10_1_256.
(OPTION_MASK_ISA2_AVX512F_UNSET): Unset AVX10_1.
(ix86_handle_option): Remove special handling for AVX512/AVX10.1
options, evex512 and avx10_1_256. Modify ISA set for AVX10 options.
* common/config/i386/i386-cpuinfo.h
(enum feature_priority): Remove P_AVX10_1_256.
(enum processor_features): Remove FEATURE_AVX10_1_256.
* common/config/i386/i386-isas.h: Remove avx10.1-256/512.
* config/i386/avx512bf16intrin.h: Rollback target push before
evex512 is introduced.
* config/i386/avx512bf16vlintrin.h: Ditto.
* config/i386/avx512bitalgintrin.h: Ditto.
* config/i386/avx512bitalgvlintrin.h: Ditto.
* config/i386/avx512bwintrin.h: Ditto.
* config/i386/avx512cdintrin.h: Ditto.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/avx512fp16intrin.h: Ditto.
* config/i386/avx512fp16vlintrin.h: Ditto.
* config/i386/avx512ifmaintrin.h: Ditto.
* config/i386/avx512ifmavlintrin.h: Ditto.
* config/i386/avx512vbmi2intrin.h: Ditto.
* config/i386/avx512vbmi2vlintrin.h: Ditto.
* config/i386/avx512vbmiintrin.h: Ditto.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vldqintrin.h: Ditto.
* config/i386/avx512vlintrin.h: Ditto.
* config/i386/avx512vnniintrin.h: Ditto.
* config/i386/avx512vnnivlintrin.h: Ditto.
* config/i386/avx512vp2intersectintrin.h: Ditto.
* config/i386/avx512vp2intersectvlintrin.h: Ditto.
* config/i386/avx512vpopcntdqintrin.h: Ditto.
* config/i386/avx512vpopcntdqvlintrin.h: Ditto.
* config/i386/gfniintrin.h: Ditto.
* config/i386/vaesintrin.h: Ditto.
* config/i386/vpclmulqdqintrin.h: Ditto.
* config/i386/driver-i386.cc (check_avx512_features): Removed.
(host_detect_local_cpu): Remove -march=native special handling.
* config/i386/i386-builtins.cc
(ix86_vectorize_builtin_gather): Remove TARGET_EVEX512.
* config/i386/i386-c.cc
(ix86_target_macros_internal): Remove EVEX512 and AVX10_1_256.
* config/i386/i386-expand.cc
(ix86_valid_mask_cmp_mode): Remove TARGET_EVEX512.
(ix86_expand_int_sse_cmp): Ditto.
(ix86_vector_duplicate_simode_const): Ditto.
(ix86_expand_vector_init_duplicate): Ditto.
(ix86_expand_vector_init_one_nonzero): Ditto.
(ix86_emit_swsqrtsf): Ditto.
(ix86_vectorize_vec_perm_const): Ditto.
(ix86_expand_vecop_qihi2): Ditto.
(ix86_expand_sse2_mulvxdi3): Ditto.
(ix86_gen_bcst_mem): Ditto.
* config/i386/i386-isa.def (EVEX512): Removed.
(AVX10_1_256): Ditto.
* config/i386/i386-options.cc
(isa2_opts): Remove evex512 and avx10.1-256.
(ix86_function_specific_save): Remove no_avx512_explicit and
no_avx10_1_explicit.
(ix86_function_specific_restore): Ditto.
(ix86_valid_target_attribute_inner_p): Remove evex512 and
avx10.1-256/512.
(ix86_valid_target_attribute_tree): Remove special handling
to rerun ix86_option_override_internal for AVX10.1-256.
(ix86_option_override_internal): Remove warning handling.
(ix86_simd_clone_adjust): Remove evex512.
* config/i386/i386.cc
(type_natural_mode): Remove TARGET_EVEX512.
(ix86_return_in_memory): Ditto.
(standard_sse_constant_p): Ditto.
(standard_sse_constant_opcode): Ditto.
(ix86_get_ssemov): Ditto.
(ix86_legitimate_constant_p): Ditto.
(ix86_vectorize_builtin_scatter): Ditto.
(ix86_hard_regno_mode_ok): Ditto.
(ix86_set_reg_reg_cost): Ditto.
(ix86_rtx_costs): Ditto.
(ix86_vector_mode_supported_p): Ditto.
(ix86_preferred_simd_mode): Ditto.
(ix86_autovectorize_vector_modes): Ditto.
(ix86_get_mask_mode): Ditto.
(ix86_simd_clone_compute_vecsize_and_simdlen): Ditto.
(ix86_simd_clone_usable): Ditto.
* config/i386/i386.h (BIGGEST_ALIGNMENT): Ditto.
(MOVE_MAX): Ditto.
(STORE_MAX_PIECES): Ditto.
(PTA_SKYLAKE_AVX512): Remove PTA_EVEX512.
(PTA_CANNONLAKE): Ditto.
(PTA_ZNVER4): Ditto.
(PTA_GRANITERAPIDS): Use PTA_AVX10_1.
(PTA_DIAMONDRAPIDS): Use PTA_GRANITERAPIDS.
* config/i386/i386.md: Remove TARGET_EVEX512, avx512f_512
and avx512bw_512.
* config/i386/i386.opt: Remove ix86_no_avx512_explicit,
ix86_no_avx10_1_explicit, mevex512, mavx10.1-256/512 and
warning for mavx10.1. Modify option comment.
* config/i386/i386.opt.urls: Remove evex512 and avx10.1-256/512.
* config/i386/predicates.md: Remove TARGET_EVEX512.
* config/i386/sse.md: Ditto.
* doc/extend.texi: Remove avx10.1-256/512. Modify avx10.1 doc.
* doc/invoke.texi: Remove avx10.1-256/512 and evex512.
* doc/sourcebuild.texi: Remove avx10.1-256/512.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_1-1.c: Remove warning.
* gcc.target/i386/avx10_1-2.c: Ditto.
* gcc.target/i386/avx10_1-3.c: Ditto.
* gcc.target/i386/avx10_1-4.c: Ditto.
* gcc.target/i386/pr111068.c: Ditto.
* gcc.target/i386/pr117946.c: Ditto.
* gcc.target/i386/pr117240_avx512f.c: Remove -mevex512 and
warning.
* gcc.target/i386/avx10_1-11.c: Rename to ...
* gcc.target/i386/avx10_1-5.c: ... this. Remove warning.
* gcc.target/i386/avx10_1-12.c: Rename to ...
* gcc.target/i386/avx10_1-6.c: ... this. Remove warning.
* gcc.target/i386/avx10_1-26.c: Rename to ...
* gcc.target/i386/avx10_1-7.c: ... this. Remove warning.
The original avx10_1-7.c is removed.
* gcc.target/i386/avx10_1-10.c: Removed.
* gcc.target/i386/avx10_1-13.c: Removed.
* gcc.target/i386/avx10_1-14.c: Removed.
* gcc.target/i386/avx10_1-15.c: Removed.
* gcc.target/i386/avx10_1-16.c: Removed.
* gcc.target/i386/avx10_1-17.c: Removed.
* gcc.target/i386/avx10_1-18.c: Removed.
* gcc.target/i386/avx10_1-19.c: Removed.
* gcc.target/i386/avx10_1-20.c: Removed.
* gcc.target/i386/avx10_1-21.c: Removed.
* gcc.target/i386/avx10_1-22.c: Removed.
* gcc.target/i386/avx10_1-23.c: Removed.
* gcc.target/i386/avx10_1-8.c: Removed.
* gcc.target/i386/avx10_1-9.c: Removed.
* gcc.target/i386/noevex512-1.c: Removed.
* gcc.target/i386/noevex512-2.c: Removed.
* gcc.target/i386/noevex512-3.c: Removed.
* gcc.target/i386/pr111889.c: Removed.
* gcc.target/i386/pr111907.c: Removed.
|
|
We forgot to initialize m_allow_adding_dup in the constructor of
riscv_subset_list, so it held an indeterminate value. As a result, -march
might or might not accept duplicate extensions at random.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::riscv_subset_list): Init m_allow_adding_dup.
Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>
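A minimal sketch of the bug class, using a hypothetical `subset_list_sketch` class rather than the real riscv_subset_list. It assumes `false` is the intended default, which the commit message does not state.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class subset_list_sketch
{
public:
  // Initialize every member in the constructor so that behavior never
  // depends on whatever bytes happened to be in memory.
  subset_list_sketch () : m_allow_adding_dup (false) {}

  void set_allow_adding_dup (bool v) { m_allow_adding_dup = v; }

  // Returns false when a duplicate extension is rejected.
  bool add (const std::string &ext)
  {
    bool seen
      = std::find (m_exts.begin (), m_exts.end (), ext) != m_exts.end ();
    if (seen && !m_allow_adding_dup)
      return false;
    if (!seen)
      m_exts.push_back (ext);
    return true;
  }

private:
  std::vector<std::string> m_exts;
  bool m_allow_adding_dup;
};
```

Without the member initializer, the second `add` of the same name would be accepted or rejected depending on an indeterminate value.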
|
|
Refactor extension flag handling by removing the old riscv_ext_flag_table and
sourcing all flag definitions directly from the flags field of the unified
riscv_ext_info_t structures generated from riscv-ext.def.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_extra_ext_flag_table_t):
New.
(riscv_ext_flag_table): Rename to ...
(riscv_extra_ext_flag_table): ... this, and drop most of the definitions
that can be obtained from the flags field of the riscv_ext_info_t
structures.
(apply_extra_extension_flags): Use riscv_ext_info_t.
(riscv_ext_is_subset): Ditto.
|
|
This commit drops the riscv_ext_version_table and instead uses the
riscv_ext_info_t data structure to provide the version information
for RISC-V extensions.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
Remove.
(standard_extensions_p): Use riscv_ext_info_t.
(get_default_version): Use riscv_ext_info_t.
(riscv_arch_help): Ditto.
|
|
riscv_ext_info_t data
Consolidate implied-extension logic by removing the old `riscv_implied_info`
array and using the `implied_exts` field in the unified riscv_ext_info_t
structures generated from `riscv-ext.def`.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_implied_info::riscv_implied_info_t): Remove unused
variant.
(struct riscv_implied_info_t): Remove unused field.
(riscv_implied_info::match): Remove unused variant, and adjust
the logic.
(get_riscv_ext_info): New.
(riscv_implied_info): Remove.
(riscv_ext_info_t::apply_implied_ext): New.
(riscv_combine_info): Remove.
(riscv_subset_list::handle_implied_ext): Use riscv_ext_info_t
rather than riscv_implied_info.
(riscv_subset_list::check_implied_ext): Ditto.
(riscv_subset_list::handle_combine_ext): Use riscv_ext_info_t
rather than riscv_combine_info.
(riscv_minimal_hwprobe_feature_bits): Use riscv_ext_info_t
rather than riscv_implied_info.
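As a rough illustration of resolving implied extensions from per-extension records, the sketch below computes the transitive closure with a worklist. The table and function names are hypothetical; the two sample rules (zve32x implies zicsr, c implies zca) are ones stated elsewhere in this log, and the real handle_implied_ext logic may differ.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the implied_exts field of each record.
static const std::map<std::string, std::vector<std::string>> implied_of = {
  {"zve32x", {"zicsr"}},
  {"c", {"zca"}},
};

// Expand the requested extensions with everything they imply.
std::set<std::string>
expand_implied (const std::vector<std::string> &requested)
{
  std::set<std::string> enabled;
  std::vector<std::string> worklist (requested);
  while (!worklist.empty ())
    {
      std::string ext = worklist.back ();
      worklist.pop_back ();
      if (!enabled.insert (ext).second)
        continue; // already handled
      auto it = implied_of.find (ext);
      if (it != implied_of.end ())
        for (const std::string &implied : it->second)
          worklist.push_back (implied);
    }
  return enabled;
}
```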
|
|
Define a new riscv_ext_info_t struct to aggregate all ISA extension fields
(name, version, flags, implied extensions, bitmask and extra flags) generated
from riscv-ext.def.
Also adjust riscv_ext_flag_table_t and riscv_implied_info_t so that they no
longer need to hold the extension name; this part will be refactored in later
patches.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_ext_info_t): New
struct.
(opt_var_ref_t): Adjust order.
(cl_opt_var_ref_t): Ditto.
(riscv_ext_flag_table_t): Adjust order, and add a new constructor
that does not hold the extension name.
(riscv_version_t): New struct.
(riscv_implied_info_t): Adjust order, and add a new constructor that does
not hold the extension name.
(apply_extra_extension_flags): New function.
(riscv_ext_infos): New.
(riscv_implied_info): Adjust.
* config/riscv/riscv-opts.h (EXT_FLAG_MACRO): New macro.
(BITMASK_NOT_YET_ALLOCATED): New macro.
|
|
We don't hold any extension flags in `target_flags`, so there is no need to
gather the extension flags in `target_flags`.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_can_inline_p): Drop
extension flags check from `target_flags`.
* config/riscv/riscv-subset.h (riscv_x_target_flags_isa_mask):
Remove.
* config/riscv/riscv.cc (riscv_x_target_flags_isa_mask): Remove.
|
|
Leverage the centralized riscv-ext.def definitions to auto-generate
the target option parsing and associated internal flags, replacing
manual listings in riscv.opt; the `riscv_ext_flag_table` part will be removed
in a later patch.
gcc/ChangeLog:
* config/riscv/gen-riscv-ext-opt.cc: New.
* config/riscv/riscv.opt: Drop manual entries for target
options, and include riscv-ext.opt.
* config/riscv/riscv-ext.opt: New.
* config/riscv/riscv-ext.opt.urls: New.
* config.gcc: Add riscv-ext.opt to the list of target options files.
* common/config/riscv/riscv-common.cc (riscv_ext_flag_table): Adjust target
option variable entry.
(riscv_set_arch_by_subset_list): Adjust target option variable.
* config/riscv/riscv-c.cc (riscv_ext_flag_table): Adjust target
option variable entry.
* config/riscv/riscv-vector-builtins.cc (pragma_intrinsic_flags):
Adjust variable name.
(riscv_pragma_intrinsic_flags_pollute): Adjust variable name.
(riscv_pragma_intrinsic_flags_restore): Ditto.
* config/riscv/t-riscv: Add the rule for generating
riscv-ext.opt.
* config/riscv/riscv-opts.h (TARGET_MIN_VLEN): Update.
(TARGET_MIN_VLEN_OPTS): Update.
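The generator itself is not shown in this log. As a general illustration of deriving several artifacts from one definition list, here is an X-macro sketch. The macro names, enum and table are hypothetical and are not copied from riscv-ext.def; a real .def file would be pulled in with #include rather than defined inline.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for a .def file: one list of (name, major, minor) records.
#define SKETCH_EXT_LIST(X) \
  X (zicsr, 2, 0)          \
  X (zca, 1, 0)            \
  X (zcb, 1, 0)

// Expansion 1: an enum with one enumerator per extension.
enum sketch_ext_id
{
#define X(NAME, MAJOR, MINOR) SKETCH_EXT_##NAME,
  SKETCH_EXT_LIST (X)
#undef X
  SKETCH_EXT_COUNT
};

// Expansion 2: a name/version table generated from the same list.
struct sketch_ext_entry
{
  const char *name;
  int major;
  int minor;
};

static const sketch_ext_entry sketch_ext_table[] = {
#define X(NAME, MAJOR, MINOR) {#NAME, MAJOR, MINOR},
  SKETCH_EXT_LIST (X)
#undef X
};

// Look up an entry by name; returns nullptr when the name is unknown.
const sketch_ext_entry *sketch_find_ext (const char *name)
{
  const std::size_t count = static_cast<std::size_t> (SKETCH_EXT_COUNT);
  for (std::size_t i = 0; i < count; ++i)
    if (std::strcmp (sketch_ext_table[i].name, name) == 0)
      return &sketch_ext_table[i];
  return nullptr;
}
```

Adding a line to the list updates both the enum and the table, which is the property the commit relies on when it replaces manual listings.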
|
|
This patch adds support for the ssnpm, smnpm, smmpm, sspm and supm
extensions[1], enabling GCC to recognize and process them correctly at
compile time.
[1]https://github.com/riscv/riscv-j-extension/blob/master/zjpm/instructions.adoc
Changes for v5:
- Fix the testsuite error in arch-50.c.
Changes for v4:
- Fix the code based on the commit id 9b13bea07706a7cae0185f8a860d67209308c050.
Changes for v3:
- Fix the error messages in gcc/testsuite/gcc.target/riscv/arch-46.c
Changes for v2:
- Add the sspm and supm extensions.
- Add the check_conflict_ext function to check the compatibility of ssnpm, smnpm, smmpm, sspm and supm extensions.
- Add the test cases for ssnpm, smnpm, smmpm, sspm and supm extensions.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::check_conflict_ext): New extension.
* config/riscv/riscv.opt: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-ss-1.c: New test.
* gcc.target/riscv/arch-ss-2.c: New test.
|
|
This patch adds support for the zilsd and zclsd[1] extensions, enabling GCC
to recognize and process them correctly at compile time.
[1] https://github.com/riscv/riscv-zilsd
Changes for v2:
- Remove the addition of zilsd extension in gcc/common/config/riscv/riscv-ext-bitmask.def
- Fix a bug with zilsd and zclsd extension dependency in gcc/common/config/riscv/riscv-common.cc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::check_conflict_ext): New extension.
* config/riscv/riscv.opt: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-zilsd-1.c: New.
* gcc.target/riscv/arch-zilsd-2.c: New.
* gcc.target/riscv/arch-zilsd-3.c: New.
|
|
This patch introduces support for RISC-V Profiles RVA23 and RVB23 [1],
enabling developers to utilize these profiles through the -march option.
[1] https://github.com/riscv/riscv-profiles/releases/tag/rva23-rvb23-ratified
Version log:
Update the testcases to use lowercase letters.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New profile.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-53.c: New test.
* gcc.target/riscv/arch-54.c: New test.
|
|
This patch introduces support for RISC-V Profiles RV20 and RV22 [1],
enabling developers to utilize these profiles through the -march option.
[1] https://github.com/riscv/riscv-profiles/releases/tag/v1.0
Version log:
Using lowercase letters to represent Profiles.
Using '_' as the divider between Profiles and other RISC-V extensions.
Add descriptions in invoke.texi.
Checking if there exist '_' between Profiles and additional extensions.
Using std::string to avoid memory problems.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (struct riscv_profiles): New struct.
(riscv_subset_list::parse_profiles): New parser.
(riscv_subset_list::parse_base_ext): Ditto.
* config/riscv/riscv-subset.h: New def.
* doc/invoke.texi: New option descriptions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-49.c: New test.
* gcc.target/riscv/arch-50.c: New test.
* gcc.target/riscv/arch-51.c: New test.
* gcc.target/riscv/arch-52.c: New test.
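The '_' rule from the version log could be checked along these lines. This is a hedged sketch, not riscv_subset_list::parse_profiles: the struct, the function and the short list of lowercase profile names are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

struct profile_split
{
  bool ok;             // false when the string is malformed
  std::string profile; // matched profile name, empty if none
  std::string rest;    // extensions following the '_' divider
};

// Illustrative list of lowercase profile names (an assumption).
static const std::vector<std::string> known_profiles
  = {"rvi20u32", "rvi20u64", "rva20u64", "rva22u64"};

profile_split split_profile (const std::string &march)
{
  for (const std::string &p : known_profiles)
    if (march.compare (0, p.size (), p) == 0)
      {
        if (march.size () == p.size ())
          return {true, p, ""};
        // Anything after the profile must be introduced by '_'.
        if (march[p.size ()] != '_')
          return {false, p, ""};
        return {true, p, march.substr (p.size () + 1)};
      }
  // Not a profile: leave the whole string to the regular parser.
  return {true, "", march};
}
```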
|
|
This patch adds support for the zama16b extension[1], enabling GCC to
recognize and process it correctly at compile time.
[1] https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-48.c: New test.
|
|
This patch adds support for the sdtrig and ssstrict extensions[1], enabling
GCC to recognize and process them correctly at compile time.
[1] https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New extension.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-47.c: New test.
|
|
This patch adds support for the svadu and svade extensions, enabling GCC to
recognize and process them correctly at compile time.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_ext_version_table): New
extension.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv.opt: New mask.
* doc/invoke.texi (RISC-V Options): New extension.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-45.c: New test.
* gcc.target/riscv/arch-46.c: New test.
|
|
The Zve32x extension depends on the Zicsr extension.
Currently, enabling Zve32x alone does not automatically imply Zicsr in GCC.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add Zve32x dependency on Zicsr.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-19.c: Set the march to rv64im_zve32x
instead of rv64gc_zve32x to avoid Zicsr being implied by g. The extra m is
added because the 'V' extension currently requires the 'M' extension.
Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
|
|
gcc/ChangeLog:
PR target/119549
* common/config/i386/i386-common.cc (ix86_handle_option):
Refactor msse4 and mno-sse4.
* config/i386/i386.opt (msse4): Remove RejectNegative.
(mno-sse4): Remove the entry.
* config/i386/i386-options.cc
(ix86_valid_target_attribute_inner_p): Remove special code
which handles mno-sse4.
|
|
GCC must infer the C extension from the Zca extension whenever
possible. This is necessary for compatibility between
different march strings that may in fact describe
the same ISA.
E.g., if an rv32ic multilib configuration is present in
GCC, then GCC will not choose this configuration for
linking if -march=rv32i_zca is passed.
Here is a more practical example. From RISC-V
Instruction Set Manual:
Therefore common ISA strings can be updated as follows
to include the relevant Zc extensions, for example:
- RV32IMC becomes RV32IM_Zce
- RV32IMCF becomes RV32IMF_Zce
With the current implication rules this will not work well
if an rv32imc configuration is present and a user
passes -march=rv32im_zce. This is how we can check
this with a simple empty test.c source file:
$ riscv64-unknown-elf-gcc -march=rv32ic -mabi=ilp32 -mriscv-attribute -S test.c
$ grep "attribute arch" test.s
.attribute arch, "rv32i2p1_c2p0_zca1p0"
$ riscv64-unknown-elf-gcc -march=rv32i_zce -mabi=ilp32 -mriscv-attribute -S test.c
$ grep "attribute arch" test.s
.attribute arch, "rv32i2p1_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0"
According to current GCC these march strings are
incompatible: the first one contains c2p0 and the
second one doesn't.
To introduce such implication rule we need to carefully
cover all possible combinations with these extensions:
zca, zcf, zcd, F and D.
According to the same manual:
As C defines the same instructions as Zca, Zcf and
Zcd, the rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
Here is a full list of cases:
1. rv32i_zca implies C.
2. rv32if_zca_zcf implies C.
3. rv32ifd_zca_zcf_zcd implies C.
4. rv64i_zca implies C.
5. rv64ifd_zca_zcd implies C.
PR target/119122
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_implied_info): Add a rule
for Zca to C implication.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-25.c: Fix dg-error expectation.
* gcc.target/riscv/attribute-c-1.c: New test.
* gcc.target/riscv/attribute-c-2.c: New test.
* gcc.target/riscv/attribute-c-3.c: New test.
* gcc.target/riscv/attribute-c-4.c: New test.
* gcc.target/riscv/attribute-c-5.c: New test.
* gcc.target/riscv/attribute-c-6.c: New test.
* gcc.target/riscv/attribute-c-7.c: New test.
* gcc.target/riscv/attribute-c-8.c: New test.
* gcc.target/riscv/attribute-zce-1.c: Update Zce tests.
* gcc.target/riscv/attribute-zce-2.c: Likewise.
* gcc.target/riscv/attribute-zce-3.c: Likewise.
* gcc.target/riscv/attribute-zce-4.c: Likewise.
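The Zca-to-C rule in this commit message can be written as a small predicate. This is a sketch derived from the quoted manual text (C implies Zca, C+F implies Zcf on RV32 only, C+D implies Zcd) and read in reverse; the struct and function names are hypothetical, and it is not the rule as implemented in riscv_implied_info.

```cpp
#include <cassert>

struct isa_sketch
{
  int xlen;          // 32 or 64
  bool f, d;         // F and D extensions
  bool zca, zcf, zcd;
};

// C can be inferred when every piece that C would have implied is present:
// Zca always, Zcf when F is enabled on RV32, Zcd when D is enabled.
bool zca_implies_c (const isa_sketch &isa)
{
  assert (isa.xlen == 32 || isa.xlen == 64);
  if (!isa.zca)
    return false;
  if (isa.xlen == 32 && isa.f && !isa.zcf)
    return false;
  if (isa.d && !isa.zcd)
    return false;
  return true;
}
```

Under this reading, all five listed cases return true, while a string such as rv32if_zca (F without Zcf on RV32) does not imply C.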
|
|
The recently announced IBM z17 processor implements the architecture
already supported as arch15. This patch adds support for z17 as an
alternative architecture name for arch15.
gcc/ChangeLog:
* common/config/s390/s390-common.cc: Rename arch15 to z17.
* config.gcc: Add z17.
* config/s390/driver-native.cc: Detect z17 machine.
* config/s390/s390-builtins.def (B_VXE3): Rename arch15 to z17.
* config/s390/s390-c.cc (s390_resolve_overloaded_builtin): Ditto.
* config/s390/s390-opts.h (enum processor_type): Ditto.
* config/s390/s390.cc: Ditto.
* config/s390/s390.h: Ditto.
* config/s390/s390.md: Ditto.
* config/s390/s390.opt: Add z17.
* doc/invoke.texi: Ditto.
|
|
The following fixes ix86_valid_target_attribute_inner_p to properly
handle target("no-sse4") via OPT_mno_sse4 rather than as unset OPT_msse4.
I've added asserts to ix86_handle_option that RejectNegative is honored
for both.
PR target/119549
* common/config/i386/i386-common.cc (ix86_handle_option):
Assert that both OPT_msse4 and OPT_mno_sse4 are never unset.
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Process negated OPT_msse4 as OPT_mno_sse4.
* gcc.target/i386/pr119549.c: New testcase.
|
|
-mavx10.1 back with 512 bit alias
When AVX10.1 options were added in GCC 14, E-cores were expected to
support up to 256 bit vector width, and P-cores up to 512 bit vector
width. Therefore, we added the avx10.1-256 and avx10.1-512 options to the
compiler, since there would be real platforms with 256 bit only support.
At the same time, so that old platforms could also compile a 256 bit only
binary, we introduced -mno-evex512 to disable 512 bit vectors.
However, all future platforms will now support 512 bit vector width,
on both P-cores and E-cores. There is therefore no need to split the
option by vector width. Therefore, we will remove them in this patch.
Unlike the AVX10.2 options, the AVX10.1 options have been in a major
release, so we have to raise a deprecation warning in GCC 15 and remove
them in GCC 16. At the same time, to align with the avx10.2 options, we
add the just-removed avx10.1 option back with a warning that mentions its
behavior change.
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_available_features): Change to FEATURE_AVX10_1.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_1_512_SET): Renamed to ...
(OPTION_MASK_ISA2_AVX10_1_SET): ... this.
(OPTION_MASK_ISA2_AVX10_2_SET): Use renamed macro.
(OPTION_MASK_ISA2_AVX10_1_UNSET): Ditto.
(ix86_handle_option): Ditto.
(processor_alias_table): Use P_PROC_AVX10_1.
* common/config/i386/i386-cpuinfo.h
(enum feature_priority): Rename from AVX10_1_512 to AVX10_1.
(enum processor_features): Ditto.
* common/config/i386/i386-isas.h: Add avx10.1.
* config/i386/driver-i386.cc
(host_detect_local_cpu): Use renamed enum.
* config/i386/i386-c.cc
(ix86_target_macros_internal): Rename to avx10.1.
* config/i386/i386-isa.def (AVX10_1_512): Rename to ...
(AVX10_1): ... this.
* config/i386/i386-options.cc (isa2_opts): Rename to avx10.1.
(ix86_valid_target_attribute_inner_p): Add avx10.1.
(ix86_option_override_internal): Rename to AVX10_1.
Revise warnings to mention behavior change for option
combination in GCC 16.
* config/i386/i386.h (PTA_DIAMONDRAPIDS): Use AVX10_1.
* config/i386/i386.opt: Add avx10.1.
Add deprecate warnings for mevex512 and mavx10.1-256/512.
* config/i386/i386.opt.urls: Add avx10.1.
* doc/extend.texi: Ditto.
* doc/sourcebuild.texi: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10-check.h: Change to avx10.1.
* gcc.target/i386/avx10_1-1.c: Add warning check.
* gcc.target/i386/avx10_1-10.c: Ditto.
* gcc.target/i386/avx10_1-11.c: Ditto.
* gcc.target/i386/avx10_1-12.c: Ditto.
* gcc.target/i386/avx10_1-13.c: Ditto.
* gcc.target/i386/avx10_1-15.c: Ditto.
* gcc.target/i386/avx10_1-16.c: Ditto.
* gcc.target/i386/avx10_1-18.c: Ditto.
* gcc.target/i386/avx10_1-19.c: Ditto.
* gcc.target/i386/avx10_1-2.c: Ditto.
* gcc.target/i386/avx10_1-20.c: Ditto.
* gcc.target/i386/avx10_1-21.c: Ditto.
* gcc.target/i386/avx10_1-22.c: Ditto.
* gcc.target/i386/avx10_1-23.c: Ditto.
* gcc.target/i386/avx10_1-26.c: Ditto.
* gcc.target/i386/avx10_1-3.c: Ditto.
* gcc.target/i386/avx10_1-4.c: Ditto.
* gcc.target/i386/avx10_1-7.c: Ditto.
* gcc.target/i386/avx10_1-8.c: Ditto.
* gcc.target/i386/avx10_1-9.c: Ditto.
* gcc.target/i386/noevex512-1.c: Ditto.
* gcc.target/i386/noevex512-2.c: Ditto.
* gcc.target/i386/pr111068.c: Ditto.
* gcc.target/i386/pr111907.c: Ditto.
* gcc.target/i386/pr117240_avx512f.c: Ditto.
* gcc.target/i386/pr117304-1.c: Ditto.
* gcc.target/i386/pr117946.c: Ditto.
* gcc.target/i386/avx10_1-24.c: Removed.
* gcc.target/i386/avx10_1-25.c: Removed.
* gcc.target/i386/avx10_1-5.c: Removed.
* gcc.target/i386/avx10_1-6.c: Removed.
|
|
When AVX10.2 options were added in GCC 15, E-cores were expected to
support up to 256 bit vector width, and P-cores up to 512 bit vector
width. Therefore, we added the avx10.2-256 and avx10.2-512 options to the
compiler, since there would be real platforms with 256 bit only support.
However, all future platforms will now support 512 bit vector width,
on both P-cores and E-cores. There is therefore no need to split the
option by vector width. Therefore, we will remove them in this patch.
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_available_features): Revise the logic AVX10 version.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_2_256_SET): Removed.
(OPTION_MASK_ISA2_AVX10_2_512_SET): Ditto.
(OPTION_MASK_ISA2_AVX10_2_SET): New.
(OPTION_MASK_ISA2_AMX_AVX512_SET): Use AVX10.2 macro.
(OPTION_MASK_ISA2_AVX10_2_UNSET): Ditto.
(ix86_handle_option): Remove avx10.2-256 part. Adjust avx10.2.
* common/config/i386/i386-cpuinfo.h
(enum processor_features): Remove FEATURE_AVX10_2_256 and skip
the value for it. Change the name from FEATURE_AVX10_2_512 to
FEATURE_AVX10_2.
* common/config/i386/i386-isas.h: Remove avx10.2-256/512.
* config/i386/avx10_2-512bf16intrin.h: Use avx10.2 instead of
avx10.2-256/512.
* config/i386/avx10_2-512convertintrin.h: Ditto.
* config/i386/avx10_2-512mediaintrin.h: Ditto.
* config/i386/avx10_2-512minmaxintrin.h: Ditto.
* config/i386/avx10_2-512satcvtintrin.h: Ditto.
* config/i386/avx10_2bf16intrin.h: Ditto.
* config/i386/avx10_2convertintrin.h: Ditto.
* config/i386/avx10_2mediaintrin.h: Ditto.
* config/i386/avx10_2minmaxintrin.h: Ditto.
* config/i386/avx10_2satcvtintrin.h: Ditto.
* config/i386/movrsintrin.h: Ditto.
* config/i386/sm4intrin.h: Ditto.
* config/i386/cpuid.h (bit_AVX10_256): Removed.
(bit_AVX10_512): Ditto.
* config/i386/driver-i386.cc (host_detect_local_cpu): Adjust
Diamond Rapids and -march=native condition.
* config/i386/i386-builtin.def (BDESC): Use AVX10.2 macro
instead of AVX10.2-256/512.
* config/i386/i386-c.cc (ix86_target_macros_internal): Ditto.
* config/i386/i386-expand.cc
(ix86_expand_branch): Use TARGET_AVX10_2 instead of specifying
vector size.
(ix86_prepare_fp_compare_args): Ditto.
(ix86_expand_fp_compare): Ditto.
(ix86_ssecom_setcc): Ditto.
(ix86_expand_sse_comi): Ditto.
(ix86_expand_sse_comi_round): Ditto.
(ix86_check_builtin_isa_match): Ditto.
* config/i386/i386.cc (ix86_fp_compare_code_to_integer): Ditto.
(ix86_get_mask_mode): Ditto.
* config/i386/i386.h (SSE_FLOAT_MODE_SSEMATH_OR_HFBF_P): Ditto.
* config/i386/i386.md: Ditto.
* config/i386/mmx.md: Ditto.
* config/i386/sse.md: Ditto.
* config/i386/predicates.md: Ditto.
* config/i386/i386-isa.def (AVX10_2_256): Removed.
(AVX10_2_512): Removed.
(AVX10_2): New.
* config/i386/i386-options.cc
(isa2_opts): Remove avx10.2-256/512.
(ix86_valid_target_attribute_inner_p): Ditto.
(PTA_DIAMONDRAPIDS): Use PTA_AVX10_2.
* config/i386/i386.opt: Remove avx10.2-256/512.
* config/i386/i386.opt.urls: Ditto.
* doc/extend.texi: Ditto.
* doc/invoke.texi: Ditto.
* doc/sourcebuild.texi: Ditto.
|
|
There are occasions where knowledge about nonzero bits makes some
optimizations possible. For example,
Rd |= Rn << Off
can be implemented as
SBRC Rn, 0
ORI Rd, 1 << Off
when Rn in { 0, 1 }, i.e. nonzero_bits (Rn) == 1. This patch adds some
patterns that exploit nonzero_bits() in some combiner patterns.
As insn conditions are not supposed to contain nonzero_bits(), the patch
splits such insns right after pass insn combine.
PR target/119421
gcc/
* config/avr/avr.opt (-muse-nonzero-bits): New option.
* config/avr/avr-protos.h (avr_nonzero_bits_lsr_operands_p): New.
(make_avr_pass_split_nzb): New.
* config/avr/avr.cc (avr_nonzero_bits_lsr_operands_p): New function.
(avr_rtx_costs_1): Return costs for the new insns.
* config/avr/avr.md (nzb): New insn attribute.
(*nzb=1.<code>...): New insns to better support some bit
operations for <code> in AND, IOR, XOR.
* config/avr/avr-passes.def (avr_pass_split_nzb): Insert pass
after combine.
* config/avr/avr-passes.cc (avr_pass_data_split_nzb): New pass data.
(avr_pass_split_nzb): New pass.
(make_avr_pass_split_nzb): New function.
* common/config/avr/avr-common.cc (avr_option_optimization_table):
Enable -muse-nonzero-bits for -O2 and higher.
* doc/invoke.texi (AVR Options): Document -muse-nonzero-bits.
gcc/testsuite/
* gcc.target/avr/torture/pr119421-sreg.c: New test.
|
|
Change AArch64 cpuinfo to follow the latest updates to the FMV spec [1]:
Remove FEAT_PREDRES and FEAT_LS64*. Preserve the ordering in enum CPUFeatures.
[1] https://github.com/ARM-software/acle/pull/382
gcc:
* common/config/aarch64/cpuinfo.h: Remove FEAT_PREDRES and FEAT_LS64*.
* config/aarch64/aarch64-option-extensions.def: Remove FMV support
for PREDRES.
libgcc:
* config/aarch64/cpuinfo.c (__init_cpu_features_constructor):
Remove FEAT_PREDRES and FEAT_LS64* support.
|
|
Enable the early scheduler on AArch64 for O3/Ofast. This means GCC15 benefits
from much faster build times with -O2, but avoids the regressions in lbm which
is very sensitive to minor scheduling changes due to long FMA chains.
gcc:
PR target/118351
PR other/38768
* common/config/aarch64/aarch64-common.cc: Enable early scheduling with
-O3 and higher.
* doc/invoke.texi (-fschedule-insns): Update comment.
|
|
Refactor the switcher classes into two separate classes:
- sve_alignment_switcher takes the alignment switching functionality,
and is used only for ABI correctness when defining sve structure
types.
- aarch64_target_switcher takes the rest of the functionality of
aarch64_simd_switcher and sve_switcher, and gates simd/sve specific
parts upon the specified feature flags.
Additionally, aarch64_target_switcher now adds dependencies of the
specified flags (which adds +fcma and +bf16 to some intrinsic
declarations), and unsets current_target_pragma.
This last change fixes an internal bug where we would sometimes add a
user specified target pragma (stored in current_target_pragma) on top of
an internally specified target architecture while initialising
intrinsics with `#pragma GCC aarch64 "arm_*.h"`. As far as I can tell, this
has no visible impact at the moment. However, the unintended target
feature combinations lead to unwanted behaviour in an under-development
patch.
This also fixes a missing Makefile dependency, which was due to
aarch64-sve-builtins.o incorrectly depending on the undefined $(REG_H).
The correct $(REGS_H) dependency is added to the switcher's new source
location.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(struct aarch64_extension_info): Add field.
(aarch64_get_required_features): New.
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_switcher::aarch64_simd_switcher): Rename to...
(aarch64_target_switcher::aarch64_target_switcher): ...this,
and extend to handle sve, nosimd and target pragmas.
(aarch64_simd_switcher::~aarch64_simd_switcher): Rename to...
(aarch64_target_switcher::~aarch64_target_switcher): ...this,
and extend to handle sve, nosimd and target pragmas.
(handle_arm_acle_h): Use aarch64_target_switcher.
(handle_arm_neon_h): Rename switcher and pass explicit flags.
(aarch64_general_init_builtins): Ditto.
* config/aarch64/aarch64-protos.h
(class aarch64_simd_switcher): Rename to...
(class aarch64_target_switcher): ...this, and add new members.
(aarch64_get_required_features): New prototype.
* config/aarch64/aarch64-sve-builtins.cc
(sve_switcher::sve_switcher): Delete.
(sve_switcher::~sve_switcher): Delete.
(sve_alignment_switcher::sve_alignment_switcher): New.
(sve_alignment_switcher::~sve_alignment_switcher): New.
(register_builtin_types): Use alignment switcher.
(init_builtins): Rename switcher.
(handle_arm_neon_sve_bridge_h): Ditto.
(handle_arm_sme_h): Ditto.
(handle_arm_sve_h): Ditto, and use alignment switcher.
* config/aarch64/aarch64-sve-builtins.h
(class sve_switcher): Delete.
(class sme_switcher): Delete.
(class sve_alignment_switcher): New.
* config/aarch64/t-aarch64 (aarch64-builtins.o): Add $(REGS_H).
(aarch64-sve-builtins.o): Remove $(REG_H).
|
|
zce must imply zcf, but this rule was broken by the
refactoring in 9e12010b5e724277ea. This can be observed
after generating an .s file from any source file with the
-mriscv-attribute -march=rv32if_zce -mabi=ilp32 -S
options. The arch attribute then shows the full march:
rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0
Note that zcf is missing even though the f_zce pair is
passed in -march. According to The RISC-V Instruction
Set Manual:
Specifying Zce on RV32 with F includes Zca, Zcb, Zcmp,
Zcmt and Zcf.
PR target/118906
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Fix zce to zcf
implication.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/attribute-zce-1.c: New test.
* gcc.target/riscv/attribute-zce-2.c: New test.
* gcc.target/riscv/attribute-zce-3.c: New test.
* gcc.target/riscv/attribute-zce-4.c: New test.
|
|
As mentioned in the avx10.1 option deprecation patch, based on the feedback
we got, we would like to re-alias avx10.x to 512 bit.
For -mno- options, as also mentioned in the previous patch, it is unclear
what they disable when it comes to avx10. So from AVX10.2 on we only provide
-mno-avx10.x options, which disable the whole of AVX10.x.
gcc/ChangeLog:
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX10_1_UNSET): Adjust macro.
(OPTION_MASK_ISA2_AVX10_2_256_UNSET): Removed.
(OPTION_MASK_ISA2_AVX10_2_512_UNSET): Ditto.
(OPTION_MASK_ISA2_AVX10_2_UNSET): New.
(ix86_handle_option): Remove disable part for avx10.2-256.
Rename avx10.2-512 switch case to avx10.2 and adjust disable
part macro.
* common/config/i386/i386-isas.h: Adjust avx10.2 and
avx10.2-512.
* config/i386/driver-i386.cc
(host_detect_local_cpu): Do not append -mno-avx10.x-256
for -march=native.
* config/i386/i386-options.cc
(ix86_valid_target_attribute_inner_p): Adjust avx10.2 and
avx10.2-512.
* config/i386/i386.opt: Reject Negative for mavx10.2-256.
Alias mavx10.2-512 to mavx10.2. Reject Negative for
mavx10.2-512.
* doc/extend.texi: Adjust documentation.
* doc/sourcebuild.texi: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-512-vminmaxbf16-2.c:
Add missing avx10_2_512 check.
* gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10-check.h: Change avx10.2 to avx10.2-256.
* gcc.target/i386/avx10_2-bf16-1.c: Ditto.
* gcc.target/i386/avx10_2-bf16-vector-cmp-1.c: Ditto.
* gcc.target/i386/avx10_2-bf16-vector-fma-1.c: Ditto.
* gcc.target/i386/avx10_2-bf16-vector-operations-1.c: Ditto.
* gcc.target/i386/avx10_2-bf16-vector-smaxmin-1.c: Ditto.
* gcc.target/i386/avx10_2-builtin-1.c: Ditto.
* gcc.target/i386/avx10_2-builtin-2.c: Ditto.
* gcc.target/i386/avx10_2-comibf-1.c: Ditto.
* gcc.target/i386/avx10_2-comibf-2.c: Ditto.
* gcc.target/i386/avx10_2-comibf-3.c: Ditto.
* gcc.target/i386/avx10_2-comibf-4.c: Ditto.
* gcc.target/i386/avx10_2-compare-1.c: Ditto.
* gcc.target/i386/avx10_2-compare-1b.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
* gcc.target/i386/avx10_2-media-1.c: Ditto.
* gcc.target/i386/avx10_2-minmax-1.c: Ditto.
* gcc.target/i386/avx10_2-movrs-1.c: Ditto.
* gcc.target/i386/avx10_2-partial-bf16-vector-fast-math-1.c: Ditto.
* gcc.target/i386/avx10_2-partial-bf16-vector-fma-1.c: Ditto.
* gcc.target/i386/avx10_2-partial-bf16-vector-operations-1.c: Ditto.
* gcc.target/i386/avx10_2-partial-bf16-vector-smaxmin-1.c: Ditto.
* gcc.target/i386/avx10_2-rounding-1.c: Ditto.
* gcc.target/i386/avx10_2-rounding-2.c: Ditto.
* gcc.target/i386/avx10_2-rounding-3.c: Ditto.
* gcc.target/i386/avx10_2-satcvt-1.c: Ditto.
* gcc.target/i386/avx10_2-vaddbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vcmpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vcomisbf16-1.c: Ditto.
* gcc.target/i386/avx10_2-vcomisbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttbf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttbf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttpd2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2dqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2qqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2udqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2uqqs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttsd2sis-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttsd2usis-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttss2sis-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttss2usis-2.c: Ditto.
* gcc.target/i386/avx10_2-vdivbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vdpphps-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmaddXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmsubXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmaddXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmsubXXXbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfpclassbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vgetexpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vgetmantbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vmaxbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto.
* gcc.target/i386/avx10_2-vmovd-1.c: Ditto.
* gcc.target/i386/avx10_2-vmovd-2.c: Ditto.
* gcc.target/i386/avx10_2-vmovw-1.c: Ditto.
* gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
* gcc.target/i386/avx10_2-vmpsadbw-2.c: Ditto.
* gcc.target/i386/avx10_2-vmulbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwsud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwusd-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpwuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vrcpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vreducebf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vrndscalebf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vrsqrtbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vscalefbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vsqrtbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vsubbf16-2.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Ditto.
* gcc.target/i386/part-vect-vec_cmpbf.c: Ditto.
* gcc.target/i386/pr117495.c: Ditto.
* gcc.target/i386/sm4-avx10_2-1.c: Ditto.
* gcc.target/i386/sm4-check.h: Ditto.
* gcc.target/i386/vnniint16-auto-vectorize-3.c: Ditto.
* gcc.target/i386/vnniint8-auto-vectorize-3.c: Ditto.
* lib/target-supports.exp: Ditto.
|
|
Based on the feedback we got, we would like to re-alias avx10.x to 512
bit in the future. This leaves the current avx10.1 alias to 256 bit
inconsistent. Since it has been there for GCC 14.1 and GCC 14.2,
we have decided to deprecate the avx10.1 alias. The current plan is not
to add it back in the future, but that might change if necessary.
For -mno- options, it is unclear what they disable when it comes
to avx10. Since hardly anyone enables AVX10 with 512 bit and then
disables it, we will only provide -mno-avx10.x options in the
future, disabling the whole of AVX10.x. If someone really wants to disable
512 bit after enabling it, -mavx10.x-512 -mno-avx10.x -mavx10.x-256 is
the only way to do that, since we also do not want to break the usual
convention that -m- options enable everything they mention.
However, since avx10.1 itself is deprecated, there is no reason
to have -mno-avx10.1. Thus, we need to keep -mno-avx10.1-[256,512].
To avoid confusion, we make -mno-avx10.1-512 disable the
whole AVX10.1 set to match the future -mno-avx10.x.
gcc/ChangeLog:
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVX2_UNSET): Change AVX10.1 unset macro.
(OPTION_MASK_ISA2_AVX10_1_256_UNSET): Removed.
(OPTION_MASK_ISA2_AVX10_1_512_UNSET): Removed.
(OPTION_MASK_ISA2_AVX10_1_UNSET): New.
(ix86_handle_option): Adjust AVX10.1 unset macro.
* common/config/i386/i386-isas.h: Remove avx10.1.
* config/i386/i386-options.cc
(ix86_valid_target_attribute_inner_p): Ditto.
(ix86_option_override_internal): Adjust warning message.
* config/i386/i386.opt: Remove mavx10.1.
* doc/extend.texi: Remove avx10.1 and adjust doc.
* doc/sourcebuild.texi: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10-check.h: Change to avx10.1-256.
* gcc.target/i386/avx10_1-1.c: Ditto.
* gcc.target/i386/avx10_1-13.c: Ditto.
* gcc.target/i386/avx10_1-14.c: Ditto.
* gcc.target/i386/avx10_1-21.c: Ditto.
* gcc.target/i386/avx10_1-22.c: Ditto.
* gcc.target/i386/avx10_1-23.c: Ditto.
* gcc.target/i386/avx10_1-24.c: Ditto.
* gcc.target/i386/avx10_1-3.c: Ditto.
* gcc.target/i386/avx10_1-5.c: Ditto.
* gcc.target/i386/avx10_1-6.c: Ditto.
* gcc.target/i386/avx10_1-8.c: Ditto.
* gcc.target/i386/pr117946.c: Ditto.
* gcc.target/i386/avx10_1-12.c: Adjust warning message.
* gcc.target/i386/avx10_1-19.c: Ditto.
* gcc.target/i386/avx10_1-17.c: Adjust to no-avx10.1-512.
|
|
In commit 7f1989249e25af6fc0f124452efa24b3796b767a
"[gcn] Set 'UI_NONE' for 'TARGET_EXCEPT_UNWIND_INFO' [PR94282]", we've copied
the 'UI_NONE' idea from nvptx to GCN.
I understand the intention of using 'UI_NONE' like this, and it happens to work
in a lot of cases, but there are ICEs elsewhere: code paths where we run into
'internal compiler error: in get_personality_function, at expr.cc:13512':
13494 /* Extracts the personality function of DECL and returns the corresponding
13495 libfunc. */
13496
13497 rtx
13498 get_personality_function (tree decl)
13499 {
13500 tree personality = DECL_FUNCTION_PERSONALITY (decl);
13501 enum eh_personality_kind pk;
13502
13503 pk = function_needs_eh_personality (DECL_STRUCT_FUNCTION (decl));
13504 if (pk == eh_personality_none)
13505 return NULL;
13506
13507 if (!personality
13508 && pk == eh_personality_any)
13509 personality = lang_hooks.eh_personality ();
13510
13511 if (pk == eh_personality_lang)
13512 gcc_assert (personality != NULL_TREE);
13513
13514 return XEXP (DECL_RTL (personality), 0);
13515 }
..., where 'lang_hooks.eh_personality ()' ends up calling
'gcc/expr.cc:build_personality_function', and we 'return NULL;' for 'UI_NONE':
13448 /* Build a decl for a personality function given a language prefix. */
13449
13450 tree
13451 build_personality_function (const char *lang)
13452 {
13453 const char *unwind_and_version;
13454 tree decl, type;
13455 char *name;
13456
13457 switch (targetm_common.except_unwind_info (&global_options))
13458 {
13459 case UI_NONE:
13460 return NULL;
[...]
(Compared to nvptx's current use of 'UI_NONE', this problem (the ICEs mentioned
above) is far more prevalent for GCN.)
The GCC internals documentation indeed states, 'gcc/doc/tm.texi':
@deftypefn {Common Target Hook} {enum unwind_info_type} TARGET_EXCEPT_UNWIND_INFO (struct gcc_options *@var{opts})
This hook defines the mechanism that will be used for exception handling
by the target. If the target has ABI specified unwind tables, the hook
should return @code{UI_TARGET}. If the target is to use the
@code{setjmp}/@code{longjmp}-based exception handling scheme, the hook
should return @code{UI_SJLJ}. If the target supports DWARF 2 frame unwind
information, the hook should return @code{UI_DWARF2}.
A target may, if exceptions are disabled, choose to return @code{UI_NONE}.
This may end up simplifying other parts of target-specific code. [...]
Here, note: "if exceptions are disabled" (meaning: '-fno-exceptions' etc.) may
"return @code{UI_NONE}". That's what other back ends do with code like:
/* For simplicity elsewhere in this file, indicate that all unwind
info is disabled if we're not emitting unwind tables. */
if (!opts->x_flag_exceptions && !opts->x_flag_unwind_tables)
return UI_NONE;
else
return UI_TARGET;
The corresponding "simplifying other parts of target-specific code"/
"simplicity elsewhere" would then be the early returns from
'TARGET_ASM_UNWIND_EMIT', 'ARM_OUTPUT_FN_UNWIND', etc. for
'TARGET_EXCEPT_UNWIND_INFO != UI_TARGET' (that is, for 'UI_NONE').
From the documentation (and implementation), however, it does *not* follow that
if a target doesn't implement support for exception handling, it may just set
'UI_NONE' for 'TARGET_EXCEPT_UNWIND_INFO'.
Therefore, switch to 'UI_TARGET', allocating a "fake" 'exception_section'.
With that, all these 'internal compiler error: in get_personality_function'
test cases turn into PASS, or UNSUPPORTED ('exception handling not supported'),
or re-classify into a few other, already known issues. And, this change also
happens to resolve the class of errors identified in GCC PR113331
"AMDGCN: Compilation failure due to duplicate .LEHB<n>/.LEHE<n> symbols".
(If using 'UI_NONE' as originally intended really does make sense, and
is preferable to this 'UI_TARGET' solution, then more work will be necessary
to implement the missing parts, where 'UI_NONE' currently isn't handled.)
PR target/94282
PR target/113331
gcc/
* common/config/gcn/gcn-common.cc (gcn_except_unwind_info): 'return UI_TARGET;'.
* config/gcn/gcn.cc (gcn_asm_init_sections): New function.
(TARGET_ASM_INIT_SECTIONS): '#define'.
|
|
Subversion r263265 (Git commit 77e0a97acf7b00c1e68e4738fdf275a4cffc2e50)
"[nvptx] Ignore c++ exceptions", originally had set 'UI_TARGET', but as part of
Subversion r263287 (Git commit d989dba8ef02c2406b7c9e62b352197dffc6b880)
"[c++] Don't emit exception tables for UI_NONE", then switched to 'UI_NONE'.
I understand the intention of using 'UI_NONE' like this, and it happens to work
in a lot of cases, but there are ICEs elsewhere: code paths where we run into
'internal compiler error: in get_personality_function, at expr.cc:13512':
13494 /* Extracts the personality function of DECL and returns the corresponding
13495 libfunc. */
13496
13497 rtx
13498 get_personality_function (tree decl)
13499 {
13500 tree personality = DECL_FUNCTION_PERSONALITY (decl);
13501 enum eh_personality_kind pk;
13502
13503 pk = function_needs_eh_personality (DECL_STRUCT_FUNCTION (decl));
13504 if (pk == eh_personality_none)
13505 return NULL;
13506
13507 if (!personality
13508 && pk == eh_personality_any)
13509 personality = lang_hooks.eh_personality ();
13510
13511 if (pk == eh_personality_lang)
13512 gcc_assert (personality != NULL_TREE);
13513
13514 return XEXP (DECL_RTL (personality), 0);
13515 }
..., where 'lang_hooks.eh_personality ()' ends up calling
'gcc/expr.cc:build_personality_function', and we 'return NULL;' for 'UI_NONE':
13448 /* Build a decl for a personality function given a language prefix. */
13449
13450 tree
13451 build_personality_function (const char *lang)
13452 {
13453 const char *unwind_and_version;
13454 tree decl, type;
13455 char *name;
13456
13457 switch (targetm_common.except_unwind_info (&global_options))
13458 {
13459 case UI_NONE:
13460 return NULL;
[...]
(Compared to nvptx's current use of 'UI_NONE', this problem (the ICEs mentioned
above) is far more prevalent for GCN.)
The GCC internals documentation indeed states, 'gcc/doc/tm.texi':
@deftypefn {Common Target Hook} {enum unwind_info_type} TARGET_EXCEPT_UNWIND_INFO (struct gcc_options *@var{opts})
This hook defines the mechanism that will be used for exception handling
by the target. If the target has ABI specified unwind tables, the hook
should return @code{UI_TARGET}. If the target is to use the
@code{setjmp}/@code{longjmp}-based exception handling scheme, the hook
should return @code{UI_SJLJ}. If the target supports DWARF 2 frame unwind
information, the hook should return @code{UI_DWARF2}.
A target may, if exceptions are disabled, choose to return @code{UI_NONE}.
This may end up simplifying other parts of target-specific code. [...]
Here, note: "if exceptions are disabled" (meaning: '-fno-exceptions' etc.) may
"return @code{UI_NONE}". That's what other back ends do with code like:
/* For simplicity elsewhere in this file, indicate that all unwind
info is disabled if we're not emitting unwind tables. */
if (!opts->x_flag_exceptions && !opts->x_flag_unwind_tables)
return UI_NONE;
else
return UI_TARGET;
The corresponding "simplifying other parts of target-specific code"/
"simplicity elsewhere" would then be the early returns from
'TARGET_ASM_UNWIND_EMIT', 'ARM_OUTPUT_FN_UNWIND', etc. for
'TARGET_EXCEPT_UNWIND_INFO != UI_TARGET' (that is, for 'UI_NONE').
From the documentation (and implementation), however, it does *not* follow that
if a target doesn't implement support for exception handling, it may just set
'UI_NONE' for 'TARGET_EXCEPT_UNWIND_INFO'.
Therefore, switch (back) to 'UI_TARGET', implementing some basic support for
'exception_section': discard (via a PTX comment block) whatever GCC writes into
it.
With that, all these 'internal compiler error: in get_personality_function'
test cases turn into PASS, or UNSUPPORTED ('exception handling not supported'),
or re-classify into a few other, already known issues.
(If using 'UI_NONE' as originally intended really does make sense, and
is preferable to this 'UI_TARGET' solution, then more work will be necessary
to implement the missing parts, where 'UI_NONE' currently isn't handled.)
PR target/86660
gcc/
* common/config/nvptx/nvptx-common.cc (nvptx_except_unwind_info):
'return UI_TARGET;'.
* config/nvptx/nvptx.cc (nvptx_assemble_integer): Handle
'exception_section'.
(nvptx_output_section_asm_op, nvptx_asm_init_sections): New
functions.
(TARGET_ASM_INIT_SECTIONS): '#define'.
* config/nvptx/nvptx.h (TEXT_SECTION_ASM_OP, DATA_SECTION_ASM_OP):
Don't '#define'.
(ASM_OUTPUT_DWARF_DELTA): '#define'.
|
|
This feature flag bit only exists to support the +crypto alias. Outside
of option processing this bit needs to be set or unset consistently.
This patch goes with the latter option.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc: Assert that CRYPTO
bit is not set.
* config/aarch64/aarch64-feature-deps.h
(info<FEAT>.explicit_on): Unset CRYPTO bit.
(cpu_##CORE_IDENT): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/crypto-alias-1.c: New test.
|
|
Use aarch64_validate_cpu instead of the existing duplicate (and worse)
version of the -mcpu parsing code.
The original code used fatal_error; I'm guessing that using error
instead should be ok.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(aarch64_rewrite_selected_cpu): Refactor and inline into...
(aarch64_rewrite_mcpu): this.
* config/aarch64/aarch64-protos.h
(aarch64_rewrite_selected_cpu): Delete.
|
|
Add infrastructure to allow rewriting the architecture strings passed to
the assembler (either as -march options or .arch directives). There was
already canonicalisation everywhere except for an -march driver option
passed directly to the compiler; this patch applies the same
canonicalisation there as well.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(aarch64_get_arch_string_for_assembler): New.
(aarch64_rewrite_march): New.
(aarch64_rewrite_selected_cpu): Call new function.
* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
* config/aarch64/aarch64-protos.h
(aarch64_get_arch_string_for_assembler): New.
* config/aarch64/aarch64.cc
(aarch64_declare_function_name): Call new function.
(aarch64_start_file): Ditto.
* config/aarch64/aarch64.h
(EXTRA_SPEC_FUNCTIONS): Use new macro name.
(MCPU_TO_MARCH_SPEC): Rename to...
(MARCH_REWRITE_SPEC): ...this, and extend the spec rule.
(aarch64_rewrite_march): New declaration.
(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
(AARCH64_BASE_SPEC_FUNCTIONS): ...this, and add new function.
(ASM_CPU_SPEC): Use new macro name.
|
|
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(aarch64_get_all_extension_candidates): Inline into...
(aarch64_print_hint_for_extensions): ...this.
|
|
Aside from moving the functions, the only changes are to make them
non-static, and to use the existing info arrays within aarch64-common.cc
instead of the info arrays remaining in aarch64.cc.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(aarch64_get_all_extension_candidates): Move within file.
(aarch64_print_hint_candidates): Move from aarch64.cc.
(aarch64_print_hint_for_extensions): Ditto.
(aarch64_print_hint_for_arch): Ditto.
(aarch64_print_hint_for_core): Ditto.
(enum aarch_parse_opt_result): Ditto.
(aarch64_parse_arch): Ditto.
(aarch64_parse_cpu): Ditto.
(aarch64_parse_tune): Ditto.
(aarch64_validate_march): Ditto.
(aarch64_validate_mcpu): Ditto.
(aarch64_validate_mtune): Ditto.
* config/aarch64/aarch64-protos.h
(aarch64_rewrite_selected_cpu): Move within file.
(aarch64_print_hint_for_extensions): Share function prototype.
(aarch64_print_hint_for_arch): Ditto.
(aarch64_print_hint_for_core): Ditto.
(enum aarch_parse_opt_result): Ditto.
(aarch64_validate_march): Ditto.
(aarch64_validate_mcpu): Ditto.
(aarch64_validate_mtune): Ditto.
(aarch64_get_all_extension_candidates): Unshare prototype.
* config/aarch64/aarch64.cc
(aarch64_parse_arch): Move to aarch64-common.cc.
(aarch64_parse_cpu): Ditto.
(aarch64_parse_tune): Ditto.
(aarch64_print_hint_candidates): Ditto.
(aarch64_print_hint_for_core): Ditto.
(aarch64_print_hint_for_arch): Ditto.
(aarch64_print_hint_for_extensions): Ditto.
(aarch64_validate_mcpu): Ditto.
(aarch64_validate_march): Ditto.
(aarch64_validate_mtune): Ditto.
|
|
Also add a (currently unused) processor field to aarch64_processor_info,
and change name from "" to NULL for the terminating array entries.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(struct aarch64_option_extension): Rename to...
(struct aarch64_extension_info): ...this.
(all_extensions): Update type name.
(struct arch_to_arch_name): Rename to...
(struct aarch64_arch_info): ...this, and rename name field.
(all_architectures): Update type names, and move before...
(struct processor_name_to_arch): ...this. Rename to...
(struct aarch64_processor_info): ...this, rename name field and
add cpu field.
(all_cores): Update type name, and set new field.
(aarch64_parse_extension): Update names.
(aarch64_get_all_extension_candidates): Ditto.
(aarch64_rewrite_selected_cpu): Ditto.
|
|
The list of cores in aarch64-common.cc included an explicit "generic"
entry, despite this entry also being present in aarch64-cores.def.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(all_cores): Remove explicit generic entry.
|
|
gcc/ChangeLog:
* common/config/s390/s390-common.cc: Add arch15 processor flags.
* config.gcc: Add arch15 for options --with-{arch,mtune}.
* config/s390/driver-native.cc (s390_host_detect_local_cpu):
Default to arch15.
* config/s390/s390-opts.h (enum processor_type): Add
PROCESSOR_ARCH15.
* config/s390/s390.cc (processor_table,s390_issue_rate,
s390_get_sched_attrmask,s390_get_unit_mask): Add arch15.
* config/s390/s390.h (enum processor_flags): Add processor flags
for VXE3 and ARCH15.
(TARGET_CPU_VXE3): Define.
(TARGET_CPU_VXE3_P): Define.
(TARGET_CPU_ARCH15): Define.
(TARGET_CPU_ARCH15_P): Define.
(TARGET_VXE3): Define.
(TARGET_VXE3_P): Define.
(TARGET_ARCH15): Define.
(TARGET_ARCH15_P): Define.
* config/s390/s390.md: Add VXE3 and ARCH15 to cpu_facility, and
let attribute "enabled" deal with them.
* config/s390/s390.opt: Add arch15.
gcc/testsuite/ChangeLog:
* gcc.target/s390/s390.exp: Set compiler flags for the vxe3
subdirectory of the testsuite as done e.g. for vxe2.
|
|
This patch only supports a landing pad value of 0.
The next version will implement a function-signature-based labeling
scheme.
RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFILP ISA
string.
* config.gcc: Add riscv-zicfilp.o
* config/riscv/riscv-passes.def (INSERT_PASS_BEFORE):
Insert landing pad instructions.
* config/riscv/riscv-protos.h (make_pass_insert_landing_pad):
Declare.
* config/riscv/riscv-zicfilp.cc: New file.
* config/riscv/riscv.cc
(riscv_trampoline_init): Add landing pad instructions.
(riscv_legitimize_call_address): Likewise.
(riscv_output_mi_thunk): Likewise.
* config/riscv/riscv.h: Update.
* config/riscv/riscv.md: Add landing pad patterns.
* config/riscv/riscv.opt (TARGET_ZICFILP): Define.
* config/riscv/t-riscv: Add build rule for
riscv-zicfilp.o
gcc/testsuite/ChangeLog:
* gcc.target/riscv/interrupt-no-lpad.c: New test.
* gcc.target/riscv/zicfilp-call.c: New test.
Co-Developed-by: Greg McGary <gkm@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>
|
|
This patch is implemented according to the RISC-V CFI specification.
It supports the generation of shadow stack instructions in the prologue,
epilogue, non-local gotos, and unwinding.
RISC-V CFI SPEC: https://github.com/riscv/riscv-cfi
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add ZICFISS ISA string.
* config/riscv/predicates.md: New predicate x1x5_operand.
* config/riscv/riscv.cc
(riscv_expand_prologue): Insert shadow stack instructions.
(riscv_expand_epilogue): Likewise.
(riscv_for_each_saved_reg): Assign t0 or ra register for
sspopchk instruction.
(need_shadow_stack_push_pop_p): New function. Omit shadow
stack operation on leaf function.
* config/riscv/riscv.h
(need_shadow_stack_push_pop_p): Define.
* config/riscv/riscv.md: Add shadow stack patterns.
(save_stack_nonlocal): Add shadow stack instructions for setjmp.
(restore_stack_nonlocal): Add shadow stack instructions for longjmp.
* config/riscv/riscv.opt (TARGET_ZICFISS): Define.
libgcc/ChangeLog:
* config/riscv/linux-unwind.h: Include shadow-stack-unwind.h.
* config/riscv/shadow-stack-unwind.h
(_Unwind_Frames_Extra): Define.
(_Unwind_Frames_Increment): Define.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/ssp-1.c: New test.
* gcc.target/riscv/ssp-2.c: New test.
Co-Developed-by: Greg McGary <gkm@rivosinc.com>,
Kito Cheng <kito.cheng@gmail.com>
|
|
In the ISE, the model number for Diamond Rapids is 13_01H.
Remove 0x00 since it is unused.
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_intel_cpu): Remove 0x00.
|
|
The early scheduler takes up ~33% of the total build time, however it doesn't
provide a meaningful performance gain. This is partly because modern OoO cores
need far less scheduling, partly because the scheduler tends to create many
unnecessary spills by increasing register pressure. Building applications
56% faster is far more useful than ~0.1% improvement on SPEC, so switch off
early scheduling on AArch64. Codesize reduces by ~0.2%.
Fix various tests that depend on scheduling by explicitly adding -fschedule-insns.
gcc:
* common/config/aarch64/aarch64-common.cc: Switch off fschedule_insns.
gcc/testsuite:
* gcc.dg/guality/pr36728-3.c: Remove XFAIL.
* gcc.dg/guality/pr68860-1.c: Likewise.
* gcc.dg/guality/pr68860-2.c: Likewise.
* gcc.target/aarch64/ldp_aligned.c: Fix test.
* gcc.target/aarch64/ldp_always.c: Likewise.
* gcc.target/aarch64/ldp_stp_10.c: Add -fschedule-insns.
* gcc.target/aarch64/ldp_stp_12.c: Likewise.
* gcc.target/aarch64/ldp_stp_13.c: Remove test.
* gcc.target/aarch64/ldp_stp_21.c: Add -fschedule-insns.
* gcc.target/aarch64/ldp_stp_8.c: Likewise.
* gcc.target/aarch64/ldp_vec_v2sf.c: Likewise.
* gcc.target/aarch64/ldp_vec_v2si.c: Likewise.
* gcc.target/aarch64/test_frame_16.c: Fix test.
* gcc.target/aarch64/sve/vcond_12.c: Add -fschedule-insns.
* gcc.target/aarch64/sve/acle/general/ldff1_3.c: Likewise.
|