Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
This patch adds cde feature (optional) support for Cortex-M55 CPU, please refer
[1] for more details. To use this feature we need to specify +cdecpN
(e.g. -mcpu=cortex-m55+cdecp<N>), where N is the coprocessor number 0 to 7.
gcc/ChangeLog:
2023-01-13 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* common/config/arm/arm-common.cc (arm_canon_arch_option_1): Ignore cde
options for -mlibarch.
* config/arm/arm-cpus.in (begin cpu cortex-m55): Add cde options.
* doc/invoke.texi (CDE): Document options for Cortex-M55 CPU.
gcc/testsuite/ChangeLog:
2023-01-13 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* gcc.target/arm/multilib.exp: Add multilib tests for Cortex-M55 CPU.
|
|
This adds znver4 automata units and reservations separately from other
znver automata, avoiding the insn-automata.cc size blow-up.
gcc/ChangeLog:
* common/config/i386/i386-common.cc (processor_alias_table):
Use CPU_ZNVER4 for znver4.
* config/i386/i386.md: Add znver4.md.
* config/i386/znver4.md: New.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_intel_cpu): Handle Emeraldrapids.
* common/config/i386/i386-common.cc: Add Emeraldrapids.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_intel_cpu): Remove case 0xb5
for meteorlake.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc:
|
|
PR106680 shows that -m32 -mpowerpc64 is different from
-mpowerpc64 -m32, this is determined by the way how we
handle option powerpc64 in rs6000_handle_option.
Segher pointed out this difference should be taken as
a bug and we should ensure that option powerpc64 is
independent of -m32/-m64. So this patch removes the
handlings in rs6000_handle_option and add some necessary
supports in rs6000_option_override_internal instead.
With this patch, if users specify -m{no-,}powerpc64, the
specified value is honoured, otherwise, for 64bit it
always enables OPTION_MASK_POWERPC64; while for 32bit
and TARGET_POWERPC64 and OS_MISSING_POWERPC64, it disables
OPTION_MASK_POWERPC64.
btw, following Segher's suggestion, I did some tries to warn
when OPTION_MASK_POWERPC64 is set for OS_MISSING_POWERPC64.
If warn for the case that powerpc64 is specified explicitly,
there are some TCs using -m32 -mpowerpc64 on ppc64-linux,
they need some updates, meanwhile the artificial run
with "--target_board=unix'{-m32/-mpowerpc64}'" will have
noisy warnings on ppc64-linux. If warn for the case that
it's specified implicitly, they can just be initialized by
TARGET_DEFAULT (like -m32 on ppc64-linux) or set from the
given cpu mask, we have to special case them and not to warn.
As Segher's latest comment, I decide not to warn them and
keep it consistent with before.
Bootstrapped and regress-tested on:
- powerpc64-linux-gnu P7 and P8 {-m64,-m32}
- powerpc64le-linux-gnu P9 and P10
- powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}
- powerpc-darwin9 (with Iain's help)
PR target/106680
gcc/ChangeLog:
* common/config/rs6000/rs6000-common.cc (rs6000_handle_option): Remove
the adjustment for option powerpc64 in -m64 handling, and remove the
whole -m32 handling.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): When no
explicit powerpc64 option is provided, enable it for -m64. For 32 bit
and OS_MISSING_POWERPC64, disable powerpc64 if it's enabled but not
specified explicitly.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106680-1.c: New test.
* gcc.target/powerpc/pr106680-2.c: New test.
* gcc.target/powerpc/pr106680-3.c: New test.
* gcc.target/powerpc/pr106680-4.c: New test.
2022-12-27 Kewen Lin <linkw@linux.ibm.com>
Iain Sandoe <iain@sandoe.co.uk>
|
|
Followed by the discussion in pr107692, -munroll-only-small-loops
Does not turns on/off -funroll-loops, and current check in
pass_rtl_unroll_loops::gate would cause -fno-unroll-loops do not take
effect. Revert the change about targetm.loop_unroll_adjust and apply
the backend option change to strictly follow the rule that
-funroll-loops takes full control of loop unrolling, and
munroll-only-small-loops just change its behavior to unroll small size
loops.
gcc/ChangeLog:
PR target/107692
* common/config/i386/i386-common.cc (ix86_optimization_table):
Enable loop unroll O2, disable -fweb and -frename-registers
by default.
* config/i386/i386-options.cc
(ix86_override_options_after_change):
Disable small loop unroll when funroll-loops enabled, reset
cunroll_grow_size when it is not explicitly enabled.
(ix86_option_override_internal): Call
ix86_override_options_after_change instead of calling
ix86_recompute_optlev_based_flags and ix86_default_align
separately.
* config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
factor if -munroll-only-small-loops enabled.
* loop-init.cc (pass_rtl_unroll_loops::gate): Do not enable
loop unrolling for -O2-speed.
(pass_rtl_unroll_loops::execute): Rmove
targetm.loop_unroll_adjust check.
gcc/testsuite/ChangeLog:
PR target/107692
* gcc.dg/guality/loop-1.c: Remove additional option for ia32.
* gcc.target/i386/pr86270.c: Add -fno-unroll-loops.
* gcc.target/i386/pr93002.c: Likewise.
|
|
This reverts commit c8874c5e8a7cee2933923c40f4933602da2022fb.
|
|
gcc/ChangeLog:
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AMX_INT8_SET): Add AMX-TILE dependency.
(OPTION_MASK_ISA2_AMX_BF16_SET): Ditto.
(OPTION_MASK_ISA2_AMX_FP16_SET): Ditto.
(OPTION_MASK_ISA2_AMX_TILE_UNSET): Disable AMX_{INT8,
BF16, FP16} when disable AMX_TILE.
gcc/testsuite/ChangeLog:
* gcc.target/i386/amxbf16-dpbf16ps-2.c: Remove -amx-tile.
* gcc.target/i386/amxfp16-dpfp16ps-2.c: Ditto.
* gcc.target/i386/amxint8-dpbssd-2.c: Ditto.
* gcc.target/i386/amxint8-dpbsud-2.c: Ditto.
* gcc.target/i386/amxint8-dpbusd-2.c: Ditto.
* gcc.target/i386/amxint8-dpbuud-2.c: Ditto.
|
|
Modern processors has multiple way instruction decoders
For x86, icelake/zen3 has 5 uops, so for small loop with <= 4
instructions (usually has 3 uops with a cmp/jmp pair that can be
macro-fused), the decoder would have 2 uops bubble for each iteration
and the pipeline could not be fully utilized.
Therefore, this patch enables loop unrolling for small size loop at O2
to fullfill the decoder as much as possible. It turns on rtl loop
unrolling when targetm.loop_unroll_adjust exists and O2 plus speed only.
In x86 backend the default behavior is to unroll small loops with less
than 4 insns by 1 time.
This improves 548.exchange2 by 9% on icelake and 7.4% on zen3 with
0.9% codesize increment. For other benchmarks the variants are minor
and overall codesize increased by 0.2%.
The kernel image size increased by 0.06%, and no impact on eembc.
gcc/ChangeLog:
* common/config/i386/i386-common.cc (ix86_optimization_table):
Enable small loop unroll at O2 by default.
* config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
factor if -munroll-only-small-loops enabled and -funroll-loops/
-funroll-all-loops are disabled.
* config/i386/i386.h (struct processor_costs): Add 2 field
small_unroll_ninsns and small_unroll_factor.
* config/i386/i386.opt: Add -munroll-only-small-loops.
* doc/gcc/gcc-command-options/machine-dependent-options/x86-options.rst:
Document -munroll-only-small-loops.
* doc/gcc/gcc-command-options/option-summary.rst: Likewise.
* loop-init.cc (pass_rtl_unroll_loops::gate): Enable rtl
loop unrolling for -O2-speed and above if target hook
loop_unroll_adjust exists.
(pass_rtl_unroll_loops::execute): Set UAP_UNROLL flag
when target hook loop_unroll_adjust exists.
* config/i386/x86-tune-costs.h: Update all processor costs
with small_unroll_ninsns = 4 and small_unroll_factor = 2.
gcc/testsuite/ChangeLog:
* gcc.dg/guality/loop-1.c: Add additional option
-mno-unroll-only-small-loops.
* gcc.target/i386/pr86270.c: Add -mno-unroll-only-small-loops.
* gcc.target/i386/pr93002.c: Likewise.
|
|
gcc/c-family/ChangeLog:
* c-target.def: Port to RST.
gcc/ChangeLog:
* common/common-target.def: Port to RST.
* target.def: Port to RST.
gcc/d/ChangeLog:
* d-target.def: Port to RST.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_intel_cpu): Handle Grand Ridge.
* common/config/i386/i386-common.cc
(processor_names): Add grandridge.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h:
(enum processor_types): Add INTEL_GRANDRIDGE.
* config.gcc: Add -march=grandridge.
* config/i386/driver-i386.cc (host_detect_local_cpu):
Handle grandridge.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Ditto.
* config/i386/i386-options.cc (m_GRANDRIDGE): New define.
(processor_cost_table): Add grandridge.
* config/i386/i386.h (enum processor_type):
Add PROCESSOR_GRANDRIDGE.
(PTA_GRANDRIDGE): Ditto.
* doc/extend.texi: Add grandridge.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv16.C: Add grandridge.
* gcc.target/i386/funcspec-56.inc: Handle new march.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect raoint.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_RAOINT_SET,
OPTION_MASK_ISA2_RAOINT_UNSET): New.
(ix86_handle_option): Handle -mraoint.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_RAOINT.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
raoint.
* config.gcc: Add raointintrin.h
* config/i386/cpuid.h (bit_RAOINT): New.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__RAOINT__.
* config/i386/i386-isa.def (RAOINT): Add DEF_PTA(RAOINT).
* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
Add -mraoint.
* config/i386/sync.md (rao_a<raointop><mode>): New define insn.
* config/i386/i386.opt: Add option -mraoint.
* config/i386/x86gprintrin.h: Include raointintrin.h.
* doc/extend.texi: Document raoint.
* doc/invoke.texi: Document -mraoint.
* doc/sourcebuild.texi: Document target raoint.
* config/i386/raointintrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mraoint.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -mraoint.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add raoint target.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp: Add check_effective_target_raoint.
* gcc.target/i386/rao-helper.h: New test.
* gcc.target/i386/raoint-1.c: Ditto.
* gcc.target/i386/raoint-aadd-2.c: Ditto.
* gcc.target/i386/raoint-aand-2.c: Ditto.
* gcc.target/i386/raoint-aor-2.c: Ditto.
* gcc.target/i386/raoint-axor-2.c: Ditto.
* gcc.target/i386/x86gprintrin-1.c: Ditto.
* gcc.target/i386/x86gprintrin-2.c: Ditto.
* gcc.target/i386/x86gprintrin-3.c: Ditto.
* gcc.target/i386/x86gprintrin-4.c: Ditto.
* gcc.target/i386/x86gprintrin-5.c: Ditto.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_intel_cpu): Handle Granite Rapids.
* common/config/i386/i386-common.cc:
(processor_names): Add graniterapids.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h
(enum processor_subtypes): Add INTEL_GRANTIERAPIDS.
* config.gcc: Add -march=graniterapids.
* config/i386/driver-i386.cc (host_detect_local_cpu):
Handle graniterapids.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Ditto.
* config/i386/i386-options.cc (m_GRANITERAPIDS): New.
(processor_cost_table): Add graniterapids.
* config/i386/i386.h (enum processor_type):
Add PROCESSOR_GRANITERAPIDS.
(PTA_GRANITERAPIDS): Ditto.
* doc/extend.texi: Add graniterapids.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv16.C: Add graniterapids.
* gcc.target/i386/funcspec-56.inc: Handle new march.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect PREFETCHI.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_PREFETCHI_SET,
OPTION_MASK_ISA2_PREFETCHI_UNSET): New.
(ix86_handle_option): Handle -mprefetchi.
* common/config/i386/i386-cpuinfo.h
(enum processor_features): Add FEATURE_PREFETCHI.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY
for prefetchi.
* config.gcc: Add prfchiintrin.h.
* config/i386/cpuid.h (bit_PREFETCHI): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (VOID, PCVOID, INT)
and DEF_FUNCTION_TYPE (VOID, PCVOID, INT, INT, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Define __PREFETCHI__.
* config/i386/i386-expand.cc: Handle new builtins.
* config/i386/i386-isa.def (PREFETCHI):
Add DEF_PTA(PREFETCHI).
* config/i386/i386-options.cc
(ix86_valid_target_attribute_inner_p): Handle prefetchi.
* config/i386/i386.md (prefetchi): New define_insn.
* config/i386/i386.opt: Add option -mprefetchi.
* config/i386/predicates.md (local_func_symbolic_operand):
New predicates.
* config/i386/x86gprintrin.h: Include prfchiintrin.h.
* config/i386/xmmintrin.h (enum _mm_hint): New enum for
prefetchi.
(_mm_prefetch): Handle the highest bit of enum.
* doc/extend.texi: Document prefetchi.
* doc/invoke.texi: Document -mprefetchi.
* doc/sourcebuild.texi: Document target prefetchi.
* config/i386/prfchiintrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mprefetchi.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-1.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-13.c: Add -mprefetchi.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/x86gprintrin-1.c: Ditto.
* gcc.target/i386/x86gprintrin-2.c: Ditto.
* gcc.target/i386/x86gprintrin-3.c: Ditto.
* gcc.target/i386/x86gprintrin-4.c: Ditto.
* gcc.target/i386/x86gprintrin-5.c: Ditto.
* gcc.target/i386/prefetchi-1.c: New test.
* gcc.target/i386/prefetchi-2.c: Ditto.
* gcc.target/i386/prefetchi-3.c: Ditto.
* gcc.target/i386/prefetchi-4.c: Ditto.
Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features): Detect
amx-fp16.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AMX_FP16_SET,
OPTION_MASK_ISA2_AMX_FP16_UNSET): New macros.
(ix86_handle_option): Handle -mamx-fp16.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AMX_FP16.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
amx-fp16.
* config.gcc: Add amxfp16intrin.h.
* config/i386/cpuid.h (bit_AMX_FP16): New.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AMX_FP16__.
* config/i386/i386-isa.def: Add DEF_PTA for AMX_FP16.
* config/i386/i386-options.cc (isa2_opts): Add -mamx-fp16.
(ix86_valid_target_attribute_inner_p): Add new ATTR.
(ix86_option_override_internal): Handle AMX-FP16.
* config/i386/i386.opt: Add -mamx-fp16.
* config/i386/immintrin.h: Include amxfp16intrin.h.
* doc/extend.texi: Document -mamx-fp16.
* doc/invoke.texi: Document amx-fp16.
* doc/sourcebuild.texi: Document amx_fp16.
* config/i386/amxfp16intrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mamx-fp16.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/sse-12.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp: (check_effective_target_amx_fp16):
New proc.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/amx-check.h: Add AMX_FP16.
* gcc.target/i386/amx-helper.h: New file to support amx-fp16.
* gcc.target/i386/amxfp16-asmatt-1.c: New test.
* gcc.target/i386/amxfp16-asmintel-1.c: Ditto.
* gcc.target/i386/amxfp16-dpfp16ps-2.c: Ditto.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_intel_cpu):
Add Sierra Forest.
* common/config/i386/i386-common.cc
(processor_names): Add Sierra Forest.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h
(enum processor_types): Add INTEL_SIERRAFOREST.
* config.gcc: Add -march=sierraforest.
* config/i386/driver-i386.cc (host_detect_local_cpu):
Handle Sierra Forest.
* config/i386/i386-c.cc (ix86_target_macros_internal):
Ditto.
* config/i386/i386-options.cc (m_SIERRAFOREST): New define.
(processor_cost_table): Add sierra forest.
* config/i386/i386.h (enum processor_type):
Add PROCESSOR_SIERRA_FOREST.
(PTA_SIERRAFOREST): Ditto.
* doc/extend.texi: Add sierra forest.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv16.C: Add sierra forest.
* gcc.target/i386/funcspec-56.inc: Handle new march.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Detect cmpccxadd.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_CMPCCXADD_SET,
OPTION_MASK_ISA2_CMPCCXADD_UNSET): New.
(ix86_handle_option): Handle -mcmpccxadd.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_CMPCCXADD.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
cmpccxadd.
* config.gcc: Add cmpccxaddintrin.h.
* config/i386/cpuid.h (bit_CMPCCXADD): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE(INT, PINT, INT, INT, INT)
and DEF_FUNCTION_TYPE(LONGLONG, PLONGLONG, LONGLONG, LONGLONG, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__CMPCCXADD__.
* config/i386/i386-expand.cc (ix86_expand_special_args_builtin):
Add new parameter to indicate constant position.
Handle INT_FTYPE_PINT_INT_INT_INT
and LONGLONG_FTYPE_PLONGLONG_LONGLONG_LONGLONG_INT.
* config/i386/i386-isa.def (CMPCCXADD): Add DEF_PTA(CMPCCXADD).
* config/i386/i386-options.cc (isa2_opts): Add -mcmpccxadd.
(ix86_valid_target_attribute_inner_p): Handle cmpccxadd.
* config/i386/i386.opt: Add option -mcmpccxadd.
* config/i386/sync.md (cmpccxadd_<mode>): New define insn.
* config/i386/x86gprintrin.h: Include cmpccxaddintrin.h.
* doc/extend.texi: Document cmpccxadd.
* doc/invoke.texi: Document -mcmpccxadd.
* doc/sourcebuild.texi: Document target cmpccxadd.
* config/i386/cmpccxaddintrin.h: New file.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Add -mcmpccxadd.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-1.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-13.c: Add -mcmpccxadd.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/x86gprintrin-1.c: Ditto.
* gcc.target/i386/x86gprintrin-2.c: Ditto.
* gcc.target/i386/x86gprintrin-3.c: Ditto.
* gcc.target/i386/x86gprintrin-4.c: Ditto.
* gcc.target/i386/x86gprintrin-5.c: Ditto.
* lib/target-supports.exp (check_effective_target_cmpccxadd):
New.
* gcc.target/i386/cmpccxadd-1.c: New test.
* gcc.target/i386/cmpccxadd-2.c: Ditto.
|
|
This patch adds support for the Zawrs ISA extension.
Zawrs has been ratified by the RISC-V BoD on Oct 20th, 2022.
Binutils support has been merged as:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=eb668e50036e979fb0a74821df4eee0307b44e66
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add zawrs extension.
* config/riscv/riscv-opts.h (MASK_ZAWRS): New.
(TARGET_ZAWRS): New.
* config/riscv/riscv.opt: New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zawrs.c: New test.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
gcc/ChangeLog:
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXNECONVERT_SET,
OPTION_MASK_ISA2_AVXNECONVERT_UNSET): New.
(ix86_handle_option): Handle -mavxneconvert, unset
avxneconvert when avx2 is disabled.
* common/config/i386/i386-cpuinfo.h (processor_types): Add
FEATURE_AVXNECONVERT.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxneconvert.
* common/config/i386/cpuinfo.h (get_available_features):
Detect avxneconvert.
* config.gcc: Add avxneconvertintrin.h
* config/i386/avxneconvertintrin.h: New.
* config/i386/avx512bf16vlintrin.h (_mm256_cvtneps_pbh):
Unified builtin with avxneconvert.
(_mm_cvtneps_pbh): Ditto.
* config/i386/cpuid.h (bit_AVXNECONVERT): New.
* config/i386/i386-builtin-types.def: Add
DEF_POINTER_TYPE (PCV8HF, V8HF, CONST),
DEF_POINTER_TYPE (PCV8BF, V8BF, CONST),
DEF_POINTER_TYPE (PCV16HF, V16HF, CONST),
DEF_POINTER_TYPE (PCV16BF, V16BF, CONST),
DEF_FUNCTION_TYPE (V4SF, PCBFLOAT16),
DEF_FUNCTION_TYPE (V4SF, PCFLOAT16),
DEF_FUNCTION_TYPE (V8SF, PCBFLOAT16),
DEF_FUNCTION_TYPE (V8SF, PCFLOAT16),
DEF_FUNCTION_TYPE (V4SF, PCV8BF),
DEF_FUNCTION_TYPE (V4SF, PCV8HF),
DEF_FUNCTION_TYPE (V8SF, PCV16HF),
DEF_FUNCTION_TYPE (V8SF, PCV16BF),
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXNECONVERT__.
* config/i386/i386-expand.cc (ix86_expand_special_args_builtin):
Handle V4SF_FTYPE_PCBFLOAT16,V8SF_FTYPE_PCBFLOAT16, V4SF_FTYPE_PCFLOAT16,
V8SF_FTYPE_PCFLOAT16,V4SF_FTYPE_PCV8BF,
V4SF_FTYPE_PCV8HF,V8SF_FTYPE_PCV16BF,V8SF_FTYPE_PCV16HF.
* config/i386/i386-isa.def : Add DEF_PTA(AVXNECONVERT) New.
* config/i386/i386-options.cc (isa2_opts): Add -mavxneconvert.
(ix86_valid_target_attribute_inner_p): Handle avxneconvert.
* config/i386/i386.md: Add attr avx512bf16vl and avxneconvert.
* config/i386/i386.opt: Add option -mavxneconvert.
* config/i386/immintrin.h: Inculde avxneconvertintrin.h.
* config/i386/sse.md (vbcstnebf162ps_<mode>): New define_insn.
(vbcstnesh2ps_<mode>): Ditto.
(vcvtnee<bf16_ph>2ps_<mode>):Ditto.
(vcvtneo<bf16_ph>2ps_<mode>):Ditto.
(vcvtneps2bf16_v4sf): Ditto.
(*vcvtneps2bf16_v4sf): Ditto.
(vcvtneps2bf16_v8sf): Ditto.
* doc/invoke.texi: Document -mavxneconvert.
* doc/extend.texi: Document avxneconvert.
* doc/sourcebuild.texi: Document target avxneconvert.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-check.h: Add avxneconvert check.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -mavxneconvert.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.
* lib/target-supports.exp:add check_effective_target_avxneconvert.
* gcc.target/i386/avx-ne-convert-1.c: New test.
* gcc.target/i386/avx-ne-convert-vbcstnebf162ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vbcstnesh2ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vcvtneebf162ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vcvtneeph2ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vcvtneobf162ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vcvtneoph2ps-2.c: Ditto.
* gcc.target/i386/avx-ne-convert-vcvtneps2bf16-2.c: Ditto.
* gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1.c: Rename..
* gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: To this.
* gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: New test.
|
|
Minimal support of z*inx extension, include 'zfinx', 'zdinx' and 'zhinx/zhinxmin'
corresponding to 'f', 'd' and 'zfh/zfhmin', the 'zdinx' will imply 'zfinx'
same as 'd' imply 'f', 'zhinx' will aslo imply 'zfinx', all zfinx extension imply 'zicsr'.
Co-Authored-By: Sinan Lin <sinan@isrc.iscas.ac.cn>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: New extensions.
* config/riscv/arch-canonicalize: New imply relations.
* config/riscv/riscv-opts.h (MASK_ZFINX): New mask.
(MASK_ZDINX): Ditto.
(MASK_ZHINX): Ditto.
(MASK_ZHINXMIN): Ditto.
(TARGET_ZFINX): New target.
(TARGET_ZDINX): Ditto.
(TARGET_ZHINX): Ditto.
(TARGET_ZHINXMIN): Ditto.
* config/riscv/riscv.opt: New target variable.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (has_cpu_feature): Add comment.
(reset_cpu_feature): New.
(get_zhaoxin_cpu): Use reset_cpu_feature.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
Add svinval and svnapot extension.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv-opts.h (MASK_SVINVAL): New.
(MASK_SVNAPOT): Ditto.
(TARGET_SVINVAL): Ditto.
(TARGET_SVNAPOT): Ditto.
* config/riscv/riscv.opt (riscv_sv_subext): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-24.c:New.
* gcc.target/riscv/predef-25.c:New.
|
|
`h` was the prefix of multi-letter extension name, but it become a
extension in later RISC-V isa spec.
Fortunately we don't have any extension really defined is prefixed
with `h`, so we can just change that.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_ext_version_table):
Add `h`.
(riscv_supported_std_ext): Ditto.
(multi_letter_subset_rank): Remove `h`.
(riscv_subset_list::parse_std_ext): Handle `h` as single letter
extension.
(riscv_subset_list::parse): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-18.c: New.
* gcc.target/riscv/arch-5.c: Remove test for prefixed
with `h`.
* gcc.target/riscv/predef-23.c: New.
|
|
This reverts the changes made to znver.md in:
commit bf3b532b524ecacb3202ab2c8af419ffaaab7cff
2022-10-21 Tejas Joshi <TejasSanjay.Joshi@amd.com>
gcc/ChangeLog:
* common/config/i386/i386-common.cc (processor_alias_table): Use
CPU_ZNVER3 for znver4.
* config/i386/znver.md: Remove znver4 reservations.
|
|
Move riscv_get_valid_option_values out of
Fixes:
riscv/riscv-common.cc:1748:40: error: ‘riscv_get_valid_option_values’ was not declared in this scope
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_get_valid_option_values): Get out of ifdef.
|
|
PR target/107364
gcc/ChangeLog:
* common/config/i386/i386-cpuinfo.h (enum processor_vendor):
Fix pedantic warning.
|
|
PR target/107364
gcc/ChangeLog:
* common/config/i386/i386-cpuinfo.h (enum processor_vendor):
Reorder enum values as BUILTIN_VENDOR_MAX should not point
in the middle of the valid enum values.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (riscv_tunes): New.
(riscv_get_valid_option_values): New.
(TARGET_GET_VALID_OPTION_VALUES): New.
* config/riscv/riscv-cores.def (RISCV_TUNE): New, define options
for tune here.
(RISCV_CORE): Fix comment.
* config/riscv/riscv.cc (riscv_tune_info_table): Move definition to
riscv-cores.def.
|
|
2022-09-28 Tejas Joshi <TejasSanjay.Joshi@amd.com>
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver4.
* common/config/i386/i386-common.cc (processor_names): Add znver4.
(processor_alias_table): Add znver4 and modularize old znvers.
* common/config/i386/i386-cpuinfo.h (processor_subtypes):
AMDFAM19H_ZNVER4.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
-march=native recognize znver4 cpus.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add znver4.
* config/i386/i386-options.cc (m_ZNVER4): New definition.
(m_ZNVER): Include m_ZNVER4.
(processor_cost_table): Add znver4.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER4.
(PTA_ZNVER1): New definition.
(PTA_ZNVER2): Likewise.
(PTA_ZNVER3): Likewise.
(PTA_ZNVER4): Likewise.
* config/i386/i386.md (define_attr "cpu"): Add znver4 and rename
md file.
* config/i386/x86-tune-costs.h (znver4_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver4.
(ix86_adjust_cost): Likewise.
* config/i386/znver1.md: Rename to znver.md.
* config/i386/znver.md: Add new reservations for znver4.
* doc/extend.texi: Add details about znver4.
* doc/invoke.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.target/i386/mv29.C: Likewise.
|
|
gcc/ChangeLog
* common/config/i386/cpuinfo.h (get_available_features): Detect
avxvnniint8.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXVNNIINT8_SET): New.
(OPTION_MASK_ISA2_AVXVNNIINT8_UNSET): Ditto.
(ix86_handle_option): Handle -mavxvnniint8.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVXVNNIINT8.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxvnniint8.
* config.gcc: Add avxvnniint8intrin.h.
* config/i386/avxvnniint8intrin.h: New file.
* config/i386/cpuid.h (bit_AVXVNNIINT8): New.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXVNNIINT8__.
* config/i386/i386-options.cc (isa2_opts): Add -mavxvnniint8.
(ix86_valid_target_attribute_inner_p): Handle avxvnniint8.
* config/i386/i386-isa.def: Add DEF_PTA(AVXVNNIINT8) New..
* config/i386/i386.opt: Add option -mavxvnniint8.
* config/i386/immintrin.h: Include avxvnniint8intrin.h.
* config/i386/sse.md (UNSPEC_VPMADDUBSWACCD
UNSPEC_VPMADDUBSWACCSSD,UNSPEC_VPMADDWDACCD,
UNSPEC_VPMADDWDACCSSD): Rename according to new style.
(vpdp<vpdotprodtype>_<mode>): New define_insn.
* doc/extend.texi: Document avxvnniint8.
* doc/invoke.texi: Document -mavxvnniint8.
* doc/sourcebuild.texi: Document target avxvnniint8.
gcc/testsuite/ChangeLog
* g++.dg/other/i386-2.C: Add -mavxvnniint8.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-check.h: Add avxvnniint8 check.
* gcc.target/i386/sse-12.c: Add -mavxvnniint8.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxvnniint8): New.
* gcc.target/i386/avxvnniint8-1.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avxvnniint8-vpdpbuuds-2.c: Ditto.
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA_AVXIFMA_SET, OPTION_MASK_ISA2_AVXIFMA_UNSET,
OPTION_MASK_ISA2_AVX2_UNSET): New macro.
(ix86_handle_option): Handle -mavxifma.
* common/config/i386/i386-cpuinfo.h (processor_types): Add
FEATURE_AVXIFMA.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxifma.
* common/config/i386/cpuinfo.h (get_available_features):
Detect avxifma.
* config.gcc: Add avxifmaintrin.h
* config/i386/avx512ifmavlintrin.h: (_mm_madd52lo_epu64): Change
to macro.
(_mm_madd52hi_epu64): Likewise.
(_mm256_madd52lo_epu64): Likewise.
(_mm256_madd52hi_epu64): Likewise.
* config/i386/avxifmaintrin.h: New header.
* config/i386/cpuid.h (bit_AVXIFMA): New.
* config/i386/i386-builtin.def: Add new builtins, and correct
pattern names for AVX512IFMA.
* config/i386/i386-builtins.cc (def_builtin): Handle AVX-IFMA
builtins like AVX-VNNI.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXIFMA__.
* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
Relax ISA masks for AVXIFMA.
* config/i386/i386-isa.def: Add AVXIFMA.
* config/i386/i386-options.cc (isa2_opts): Add -mavxifma.
(ix86_valid_target_attribute_inner_p): Handle avxifma.
* config/i386/i386.md (isa): Add attr avxifma and avxifmavl.
* config/i386/i386.opt: Add option -mavxifma.
* config/i386/immintrin.h: Inculde avxifmaintrin.h.
* config/i386/sse.md (avx_vpmadd52<vpmadd52type>_<mode>):
Remove.
(vpamdd52<vpmadd52type><mode><sd_maskz_name>): Remove.
(vpamdd52huq<mode>_maskz): Rename to ...
(vpmadd52huq<mode>_maskz): ... this.
(vpamdd52luq<mode>_maskz): Rename to ...
(vpmadd52luq<mode>_maskz): ... this.
(vpmadd52<vpmadd52type><mode>): New define_insn.
(vpmadd52<vpmadd52type>v8di): Likewise.
(vpmadd52<vpmadd52type><mode>_maskz_1): Likewise.
(vpamdd52<vpmadd52type><mode>_mask): Rename to ...
(vpmadd52<vpmadd52type><mode>_mask): ... this.
* doc/invoke.texi: Document -mavxifma.
* doc/extend.texi: Document avxifma.
* doc/sourcebuild.texi: Document target avxifma.
gcc/testsuite/
* gcc.target/i386/avx-check.h: Add avxifma check.
* gcc.target/i386/avx512ifma-vpmaddhuq-1.c: Remane..
* gcc.target/i386/avx512ifma-vpmaddhuq-1a.c: To this.
* gcc.target/i386/avx512ifma-vpmaddluq-1.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddluq-1a.c: Ditto.
* gcc.target/i386/avx512ifma-vpmaddhuq-1b.c: New Test.
* gcc.target/i386/avx512ifma-vpmaddluq-1b.c: Ditto.
* gcc.target/i386/avx-ifma-1.c: Ditto.
* gcc.target/i386/avx-ifma-2.c: Ditto.
* gcc.target/i386/avx-ifma-3.c: Ditto.
* gcc.target/i386/avx-ifma-4.c: Ditto.
* gcc.target/i386/avx-ifma-5.c: Ditto.
* gcc.target/i386/avx-ifma-6.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddhuq-2.c: Ditto.
* gcc.target/i386/avx-ifma-vpmaddluq-2.c: Ditto.
* gcc.target/i386/sse-12.c: Add -mavxifma.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxifma): New.
|
|
I was looking at H8 assembly code recently and noticed we had unnecessary
extensions. As it turns out we never enabled redundant extension elimination
on the H8. This patch fixes that oversight (and was the trigger for the
failure fixed my the prior patch).
gcc/common
* common/config/h8300/h8300-common.cc (h8300_option_optimization_table):
Enable redundant extension elimination at -O2 and above.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h:
(get_intel_cpu): Handle Meteorlake.
* common/config/i386/i386-common.cc:
(processor_alias_table): Add Meteorlake.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h:
(get_intel_cpu): Handle Raptorlake.
* common/config/i386/i386-common.cc:
(processor_alias_table): Add Raptorlake.
|
|
gcc/
* common/config/arc/arc-common.cc (arc_option_optimization_table):
Remove Rcq and Rcw options.
* config/arc/arc.opt (mRcq): Ignore option, preserve it for
backwards compatibility.
(mRcw): Likewise.
* doc/invoke.texi (mRcw, mRcq): Update document.
Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
|
|
-mgeneral-regs-only is effectively "+nofp for the compiler without
changing the assembler's ISA flags". Currently that's implemented
by making TARGET_FLOAT, TARGET_SIMD and TARGET_SVE depend on
!TARGET_GENERAL_REGS_ONLY and then making any feature that needs FP
registers depend (directly or indirectly) on one of those three TARGET
macros. The problem is that it's easy to forgot to do the last bit.
This patch instead represents the distinction between "assemnbler
ISA flags" and "compiler ISA flags" more directly, funnelling
all updates through a new function that sets both sets of flags
together.
gcc/
* config/aarch64/aarch64.opt (aarch64_asm_isa_flags): New variable.
* config/aarch64/aarch64.h (aarch64_asm_isa_flags)
(aarch64_isa_flags): Redefine as read-only macros.
(TARGET_SIMD, TARGET_FLOAT, TARGET_SVE): Don't depend on
!TARGET_GENERAL_REGS_ONLY.
* common/config/aarch64/aarch64-common.cc
(aarch64_set_asm_isa_flags): New function.
(aarch64_handle_option): Call it when updating -mgeneral-regs.
* config/aarch64/aarch64-protos.h (aarch64_simd_switcher): Replace
m_old_isa_flags with m_old_asm_isa_flags.
(aarch64_set_asm_isa_flags): Declare.
* config/aarch64/aarch64-builtins.cc
(aarch64_simd_switcher::aarch64_simd_switcher)
(aarch64_simd_switcher::~aarch64_simd_switcher): Save and restore
aarch64_asm_isa_flags instead of aarch64_isa_flags.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
* config/aarch64/aarch64.cc (aarch64_set_asm_isa_flags): New function.
(aarch64_override_options, aarch64_handle_attr_arch)
(aarch64_handle_attr_cpu, aarch64_handle_attr_isa_flags): Use
aarch64_set_asm_isa_flags to set the ISA flags.
(aarch64_option_print, aarch64_declare_function_name)
(aarch64_start_file): Use aarch64_asm_isa_flags instead
of aarch64_isa_flags.
(aarch64_can_inline_p): Check aarch64_asm_isa_flags as well as
aarch64_isa_flags.
|
|
After previous changes, it's more convenient if the flags_on and
flags_off fields of all_extensions include the feature flag itself.
gcc/
* common/config/aarch64/aarch64-common.cc (all_extensions):
Include the feature flag in flags_on and flags_off.
(aarch64_parse_extension): Update accordingly.
(aarch64_get_extension_string_for_isa_flags): Likewise.
|
|
A previous patch added a aarch64_feature_flags typedef, to abstract
the representation of the feature flags. This patch makes existing
code use the typedef too. Hope I've caught them all!
gcc/
* common/config/aarch64/aarch64-common.cc: Use aarch64_feature_flags
for feature flags throughout.
* config/aarch64/aarch64-protos.h: Likewise.
* config/aarch64/aarch64-sve-builtins.h: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/aarch64/aarch64.opt: Likewise.
* config/aarch64/driver-aarch64.cc: Likewise.
|
|
Some of the option structures have all-const member variables.
That doesn't seem necessary: we can just use const on the objects
that are supposed to be read-only.
Also, with the new, more C++-heavy option handling, it seems
better to use constexpr for the static data, to make sure that
we're not adding unexpected overhead.
gcc/
* common/config/aarch64/aarch64-common.cc (aarch64_option_extension)
(processor_name_to_arch, arch_to_arch_name): Remove const from
member variables.
(all_extensions, all_cores, all_architectures): Make a constexpr.
* config/aarch64/aarch64.cc (processor): Remove const from
member variables.
(all_architectures): Make a constexpr.
* config/aarch64/driver-aarch64.cc (aarch64_core_data)
(aarch64_arch_driver_info): Remove const from member variables.
(aarch64_cpu_data, aarch64_arches): Make a constexpr.
(get_arch_from_id): Return a pointer to const.
(host_detect_local_cpu): Update accordingly.
|
|
Just a minor patch to avoid having to construct std::strings
in static data.
gcc/
* common/config/aarch64/aarch64-common.cc (processor_name_to_arch)
(arch_to_arch_name): Use const char * instead of std::string.
|
|
aarch64-common.cc has two arrays, one maintaining the original
definition order and one sorted by population count. Sorting
by population count was a way of ensuring topological ordering,
taking advantage of the fact that the entries are partially
ordered by the subset relation. However, the sorting is not
needed now that the .def file is forced to have topological
order from the outset.
Other changes are:
(1) The population count used:
uint64_t total_flags_a = opt_a->flag_canonical & opt_a->flags_on;
uint64_t total_flags_b = opt_b->flag_canonical & opt_b->flags_on;
int popcnt_a = popcount_hwi ((HOST_WIDE_INT)total_flags_a);
int popcnt_b = popcount_hwi ((HOST_WIDE_INT)total_flags_b);
where I think the & was supposed to be |. This meant that the
counts would always be 1 in practice, since flag_canonical is
a single bit. This led us to printing +nofp+nosimd even though
GCC "knows" (and GAS agrees) that +nofp disables simd.
(2) The .arch output code converts +aes+sha2 to +crypto. I think
the main reason for doing this is to support assemblers that
predate the individual per-feature crypto flags. It therefore
seems more natural to treat it as a special case, rather than
as an instance of a general pattern. Hopefully we won't do
something similar in future!
(There is already special handling of CRC, for different reasons.)
(3) Previously, if the /proc/cpuinfo code saw a feature like sve,
it would assume the presence of all the features that sve
depends on. It would be possible to keep that behaviour
if necessary, but it was simpler to assume the presence of
fp16 (say) only when fphp is present. There's an argument
that that's more conservatively correct too.
gcc/
* common/config/aarch64/aarch64-common.cc
(TARGET_OPTION_INIT_STRUCT): Delete.
(aarch64_option_extension): Remove is_synthetic_flag.
(all_extensions): Update accordingly.
(all_extensions_by_on, opt_ext, opt_ext_cmp): Delete.
(aarch64_option_init_struct, aarch64_contains_opt): Delete.
(aarch64_get_extension_string_for_isa_flags): Rewrite to use
all_extensions instead of all_extensions_on.
gcc/testsuite/
* gcc.target/aarch64/cpunative/info_8: Add all dependencies of sve.
* gcc.target/aarch64/cpunative/info_9: Likewise svesm4.
* gcc.target/aarch64/cpunative/info_15: Likewise.
* gcc.target/aarch64/cpunative/info_16: Likewise sve2.
* gcc.target/aarch64/cpunative/info_17: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_2.c: Expect just +nofp
rather than +nofp+nosimd.
* gcc.target/aarch64/cpunative/native_cpu_10.c: Likewise.
* gcc.target/aarch64/target_attr_15.c: Likewise.
|
|
Currently the aarch64-option-extensions.def entries, the
aarch64-cores.def entries, and the AARCH64_FL_FOR_* macros
have a transitive closure of dependencies that is maintained by hand.
This is a bit error-prone and is becoming less tenable as more features
are added. The main point of this patch is to maintain the closure
automatically instead.
For example, the +sve2-aes extension requires sve2 and aes.
This is now described using:
AARCH64_OPT_EXTENSION("sve2-aes", SVE2_AES, (SVE2, AES), ...)
If life was simple, we could just give the name of the feature
and the list of features that it requires/depends on. But sadly
things are more complicated. For example:
- the legacy +crypto option enables aes and sha2 only, but +nocrypto
disables all crypto-related extensions, including sm4.
- +fp16fml enables fp16, but armv8.4-a enables fp16fml without fp16.
fp16fml only has an effect when fp16 is also present; see the
comments for more details.
- +bf16 enables simd, but +bf16+nosimd is valid and enables just the
scalar bf16 instructions. rdma behaves similarly.
To handle cases like these, the option entries have extra fields to
specify what an explicit +foo enables and what an explicit +nofoo
disables, in addition to the absolute dependencies.
The other main changes are:
- AARCH64_FL_* are now defined automatically.
- the feature list for each architecture level moves from aarch64.h
to aarch64-arches.def.
As a consequence, we now have a (redundant) V8A feature flag.
While there, the patch uses a new typedef, aarch64_feature_flags,
for the set of feature flags. This should make it easier to switch
to a class if we run out of bits in the uint64_t.
For now the patch hardcodes the fact that crypto is the only
synthetic option. A later patch will remove this field.
To test for things that might not be covered by the testsuite,
I made the driver print out the all_extensions, all_cores and
all_archs arrays before and after the patch, with the following
tweaks:
- renumber the old AARCH64_FL_* bit assignments to match the .def order
- remove the new V8A flag when printing the new tables
- treat CRYPTO and CRYPTO | AES | SHA2 the same way when printing the
core tables
(On the last point: some cores enabled just CRYPTO while others enabled
CRYPTO, AES and SHA2. This doesn't cause a difference in behaviour
because of how the dependent macros are defined. With the new scheme,
all entries with CRYPTO automatically get AES and SHA2 too.)
The only difference is that +nofp now turns off dotprod. This was
another instance of an incomplete transitive closure, but unlike the
instances fixed in a previous patch, it had no observable effect.
gcc/
* config/aarch64/aarch64-option-extensions.def: Switch to a new format.
* config/aarch64/aarch64-cores.def: Use the same format to specify
lists of features.
* config/aarch64/aarch64-arches.def: Likewise, moving that information
from aarch64.h.
* config/aarch64/aarch64-opts.h (aarch64_feature_flags): New typedef.
* config/aarch64/aarch64.h (aarch64_feature): New class enum.
Turn AARCH64_FL_* macros into constexprs, getting the definitions
from aarch64-option-extensions.def. Remove AARCH64_FL_FOR_* macros.
* common/config/aarch64/aarch64-common.cc: Include
aarch64-feature-deps.h.
(all_extensions): Update for new .def format.
(all_extensions_by_on, all_cores, all_architectures): Likewise.
* config/aarch64/driver-aarch64.cc: Include aarch64-feature-deps.h.
(aarch64_extensions): Update for new .def format.
(aarch64_cpu_data, aarch64_arches): Likewise.
* config/aarch64/aarch64.cc: Include aarch64-feature-deps.h.
(all_architectures, all_cores): Update for new .def format.
* config/aarch64/aarch64-sve-builtins.cc
(check_required_extensions): Likewise.
|
|
The flags fields of the aarch64-cores.def always start with
AARCH64_FL_FOR_<ARCH>. After previous changes, <ARCH> is always
identical to the previous field, so we can drop the explicit
AARCH64_FL_FOR_<ARCH> and derive it programmatically.
This isn't a big saving in itself, but it helps with later patches.
gcc/
* config/aarch64/aarch64-cores.def: Remove AARCH64_FL_FOR_<ARCH>
from the flags field.
* common/config/aarch64/aarch64-common.cc (all_cores): Add it
here instead.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (all_cores): Likewise.
|
|
This patch completes the renaming of architecture-level related
things by adding "V" to the name of the architecture in
aarch64-arches.def. Since the "V" is predictable, we can easily
drop it when we don't need it (as when matching /proc/cpuinfo).
Having a valid C identifier is necessary for later patches.
gcc/
* config/aarch64/aarch64-arches.def: Add a leading "V" to the
ARCH_IDENT fields.
* config/aarch64/aarch64-cores.def: Update accordingly.
* common/config/aarch64/aarch64-common.cc (all_cores): Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.
* config/aarch64/driver-aarch64.cc (aarch64_arches): Skip the
leading "V".
|
|
This patch renames AARCH64_FL_FOR_ARCH* macros to follow the
same V<number><profile> names that we (now) use elsewhere.
The names are only temporary -- a later patch will move the
information to the .def file instead. However, it helps with
the sequencing to do this first.
gcc/
* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8): Rename to...
(AARCH64_FL_FOR_V8A): ...this.
(AARCH64_FL_FOR_ARCH8_1): Rename to...
(AARCH64_FL_FOR_V8_1A): ...this.
(AARCH64_FL_FOR_ARCH8_2): Rename to...
(AARCH64_FL_FOR_V8_2A): ...this.
(AARCH64_FL_FOR_ARCH8_3): Rename to...
(AARCH64_FL_FOR_V8_3A): ...this.
(AARCH64_FL_FOR_ARCH8_4): Rename to...
(AARCH64_FL_FOR_V8_4A): ...this.
(AARCH64_FL_FOR_ARCH8_5): Rename to...
(AARCH64_FL_FOR_V8_5A): ...this.
(AARCH64_FL_FOR_ARCH8_6): Rename to...
(AARCH64_FL_FOR_V8_6A): ...this.
(AARCH64_FL_FOR_ARCH8_7): Rename to...
(AARCH64_FL_FOR_V8_7A): ...this.
(AARCH64_FL_FOR_ARCH8_8): Rename to...
(AARCH64_FL_FOR_V8_8A): ...this.
(AARCH64_FL_FOR_ARCH8_R): Rename to...
(AARCH64_FL_FOR_V8R): ...this.
(AARCH64_FL_FOR_ARCH9): Rename to...
(AARCH64_FL_FOR_V9A): ...this.
(AARCH64_FL_FOR_ARCH9_1): Rename to...
(AARCH64_FL_FOR_V9_1A): ...this.
(AARCH64_FL_FOR_ARCH9_2): Rename to...
(AARCH64_FL_FOR_V9_2A): ...this.
(AARCH64_FL_FOR_ARCH9_3): Rename to...
(AARCH64_FL_FOR_V9_3A): ...this.
* common/config/aarch64/aarch64-common.cc (all_cores): Update
accordingly.
* config/aarch64/aarch64-arches.def: Likewise.
* config/aarch64/aarch64-cores.def: Likewise.
* config/aarch64/aarch64.cc (all_cores): Likewise.
|
|
All AARCH64_ISA_* architecture-level macros except AARCH64_ISA_V8_R
are for the A profile: they cause __ARM_ARCH_PROFILE to be set to
'A' and they are associated with architecture names like armv8.4-a.
It's convenient for later patches if we make this explicit
by adding an "A" to the name. Also, rather than add an underscore
(as for V8_R) it's more convenient to add the profile directly
to the number, like we already do in the ARCH_IDENT field of the
aarch64-arches.def entries.
gcc/
* config/aarch64/aarch64.h (AARCH64_ISA_V8_2, AARCH64_ISA_V8_3)
(AARCH64_ISA_V8_4, AARCH64_ISA_V8_5, AARCH64_ISA_V8_6)
(AARCH64_ISA_V9, AARCH64_ISA_V9_1, AARCH64_ISA_V9_2)
(AARCH64_ISA_V9_3): Add "A" to the end of the name.
(AARCH64_ISA_V8_R): Rename to AARCH64_ISA_V8R.
(TARGET_ARMV8_3, TARGET_JSCVT, TARGET_FRINT, TARGET_MEMTAG): Update
accordingly.
* common/config/aarch64/aarch64-common.cc
(aarch64_get_extension_string_for_isa_flags): Likewise.
* config/aarch64/aarch64-c.cc
(aarch64_define_unconditional_macros): Likewise.
|
|
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Change "static void" to "void".
* config.gcc: Add riscv-selftests.o
* config/riscv/predicates.md: Allow const_poly_int.
* config/riscv/riscv-protos.h (riscv_reinit): New function.
(riscv_parse_arch_string): change as exten function.
(riscv_run_selftests): New function.
* config/riscv/riscv.cc (riscv_cannot_force_const_mem): Don't allow poly
into const pool.
(riscv_report_v_required): New function.
(riscv_expand_op): New function.
(riscv_expand_mult_with_const_int): New function.
(riscv_legitimize_poly_move): Ditto.
(riscv_legitimize_move): New function.
(riscv_hard_regno_mode_ok): Add VL/VTYPE register allocation and fix
vector RA.
(riscv_convert_vector_bits): Fix riscv_vector_chunks configuration for
-marh no 'v'.
(riscv_reinit): New function.
(TARGET_RUN_TARGET_SELFTESTS): New target hook support.
* config/riscv/t-riscv: Add riscv-selftests.o.
* config/riscv/riscv-selftests.cc: New file.
gcc/testsuite/ChangeLog:
* selftests/riscv/empty-func.rtl: New test.
|
|
../../gcc/common/config/riscv/riscv-common.cc: In function 'const char* riscv_multi_lib_check(int, const char**)':
../../gcc/common/config/riscv/riscv-common.cc:1451:11: error: bare apostrophe ''' in format [-Werror=format-diag]
1451 | "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
| ^
../../gcc/common/config/riscv/riscv-common.cc:1451:7: note: if avoiding the apostrophe is not feasible, enclose it in a pair of '%<' and '%>' directives instead
1451 | "Can't find suitable multilib set for %<-march=%s%>/%<-mabi=%s%>",
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../gcc/common/config/riscv/riscv-common.cc: At global scope:
../../gcc/common/config/riscv/riscv-common.cc:1492:1: error: 'int riscv_check_conds(const switchstr*, int, int, const std::vector<std::__cxx11::basic_string<char> >&)' defined but not used [-Werror=unused-function]
1492 | riscv_check_conds (
| ^~~~~~~~~~~~~~~~~
../../gcc/common/config/riscv/riscv-common.cc:1374:1: error: 'const char* find_last_appear_switch(const switchstr*, int, const char*)' defined but not used [-Werror=unused-function]
1374 | find_last_appear_switch (
| ^~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:2442: riscv-common.o] Error 1
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc (RISCV_USE_CUSTOMISED_MULTI_LIB):
Move forward for cover all all necessary functions for suppress
unused function warnings.
(riscv_multi_lib_check): Move forward, and tweak message to suppress
-Werror=format-diag warning.
|