path: root/gcc/config
Age | Commit message | Author | Files | Lines
2021-01-02 | Darwin : Adjust defaults for the linker. | Iain Sandoe | 1 | -4/+8
Ideally, the linker will be queried for its version and that will be used to determine capabilities that cannot be discovered from reasonable configuration testing. When building cross tools, this might not be possible, and we have strategies for providing useful defaults. These are adjusted here to reflect current choices. gcc/ChangeLog: * config/darwin.h (MIN_LD64_NO_COAL_SECTS): Adjust. Amend handling for LD64_VERSION fallback defaults.
2021-01-02 | Darwin, Simplify headers 4/5 : Remove redundant headers. | Iain Sandoe | 4 | -97/+0
The darwinN.h headers (with the sole exception of darwin7.h, which contains a target macro definition) now only contain values that set fall-backs for cross-compilations, these can be provided from the config.gcc script which means we no longer need the darwinN.h - so delete them. gcc/ChangeLog: * config.gcc: Compute default version information from the configured target. Likewise defaults for ld64. * config/darwin10.h: Removed. * config/darwin12.h: Removed. * config/darwin9.h: Removed. * config/rs6000/darwin8.h: Removed.
2021-01-02 | Darwin, Simplify headers 3/5 : Delete dead code. | Iain Sandoe | 1 | -11/+0
Darwin defines ASM_OUTPUT_ALIGNED_DECL_COMMON which is used in preference to ASM_OUTPUT_ALIGNED_COMMON, which makes the latter definition dead code. Remove this. gcc/ChangeLog: * config/darwin9.h (ASM_OUTPUT_ALIGNED_COMMON): Delete.
2021-01-02 | Darwin, Simplify headers 2/5 : Move spec for STACK_CHECK_STATIC_BUILTIN. | Iain Sandoe | 2 | -3/+3
We now need a modern (C++11) toolchain to bootstrap GCC, so there's no need to skip the stack protect for Darwin < 9. gcc/ChangeLog: * config/darwin9.h (STACK_CHECK_STATIC_BUILTIN): Move from here.. * config/darwin.h (STACK_CHECK_STATIC_BUILTIN): .. to here.
2021-01-02 | Darwin, Simplify headers 1/5 : Move LINK_GCC_C_SEQUENCE_SPEC [NFC]. | Iain Sandoe | 2 | -11/+7
There is no need to make the LINK_GCC_C_SEQUENCE_SPEC conditional on configuration parameters, it is adequately conditionalized on the macosx-version-min. gcc/ChangeLog: * config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move from here... * config/darwin.h (LINK_GCC_C_SEQUENCE_SPEC): ... to here.
2021-01-02 | Darwin, Simplify headers 0/5 : Move spec for Darwin 10 unwind stub [NFC]. | Iain Sandoe | 2 | -1/+1
The darwinN.h headers were (presumably) introduced to allow specs to be adjusted when there was no mmacosx-version-min handling, or that was considered unreliable. We have version-specific specs for the values that have configuration data, and the version is set in the driver (so may be considered reliably present). Some of the 'darwinN.h' content has become dead code, and the remainder is either conditionalised on version information (or is setting values used as fall-backs in cross-compilations). With the changes needed for Darwin20 / macOS 11 the 'darwinN.h' headers are now too unwieldy to be useful - so this series moves the relevant specs definitions to the common 'darwin.h' header and then finally uses the config.gcc script to supply the fall-back defaults for cross-compilations. We can then delete all but the main header, since the darwinN.h are unused. This change moves a spec from darwin10.h to the main darwin.h target header. gcc/ChangeLog: * config/darwin10.h (LINK_GCC_C_SEQUENCE_SPEC): Move the spec for the Darwin10 unwinder stub from here ... * config/darwin.h (LINK_COMMAND_SPEC_A): ... to here.
2021-01-02 | Darwin : Adjust defaults for current bootstrap constraints. | Iain Sandoe | 2 | -37/+26
The toolchain now requires a C++11 compiler to bootstrap and none of the older Darwin toolchains which were based on stabs debugging are suitable. We can simplify the debug setup now. gcc/ChangeLog: * config/darwin.h (DSYMUTIL_SPEC): Default to DWARF. (ASM_DEBUG_SPEC): Only define if the assembler supports stabs. (PREFERRED_DEBUGGING_TYPE): Default to DWARF. (DARWIN_PREFER_DWARF): Define. * config/darwin9.h (PREFERRED_DEBUGGING_TYPE): Remove. (DARWIN_PREFER_DWARF): Likewise. (DSYMUTIL_SPEC): Likewise. (COLLECT_RUN_DSYMUTIL): Likewise. (ASM_DEBUG_SPEC): Likewise. (ASM_DEBUG_OPTION_SPEC): Likewise.
2020-12-30 | i386: Remove unnecessary clobbers from combine splitters. | Uros Bizjak | 1 | -37/+24
There is no need for combine splitters to emit insn patterns with clobbers, the pass is smart enough to add clobbers to patterns as necessary. 2020-12-30 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/i386.md: Remove unnecessary clobbers from combine splitters.
2020-12-30 | i386: Optimize pmovmskb on inverted vector to inversion of pmovmskb result [PR98461] | Jakub Jelinek | 1 | -0/+47
The following patch adds combine splitters to optimize:
- vpcmpeqd %ymm1, %ymm1, %ymm1
- vpandn %ymm1, %ymm0, %ymm0
  vpmovmskb %ymm0, %eax
+ notl %eax
etc. (for vectors with fewer than 32 elements, with xorl instead of notl). 2020-12-30 Jakub Jelinek <jakub@redhat.com> PR target/98461 * config/i386/sse.md (<sse2_avx2>_pmovmskb): Add splitters for pmovmskb of NOT vector. * gcc.target/i386/sse2-pr98461.c: New test. * gcc.target/i386/avx2-pr98461.c: New test.
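The identity behind this splitter can be checked with a small model. The following is only an illustrative Python sketch, not GCC code: `movemask`, `invert_bytes` and `check_identity` are hypothetical helpers that model pmovmskb and a vector NOT on a list of unsigned byte values.

```python
def movemask(vec):
    """Model pmovmskb: collect the top bit of each byte into an integer mask."""
    mask = 0
    for i, byte in enumerate(vec):
        mask |= ((byte >> 7) & 1) << i
    return mask


def invert_bytes(vec):
    """Model a vector NOT on unsigned byte elements."""
    return [b ^ 0xFF for b in vec]


def check_identity(vec):
    """movemask(NOT v) equals movemask(v) XOR an all-ones mask of len(v) bits."""
    all_ones = (1 << len(vec)) - 1
    return movemask(invert_bytes(vec)) == movemask(vec) ^ all_ones
```

With 32 byte elements the all-ones mask fills a 32-bit register, which matches the commit's use of notl; with fewer elements only the low bits may be flipped, which is where the xorl form comes in.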
2020-12-29 | arc: generate mac(u) insn instead of macd(u) when destination is accl | Claudiu Zissulescu | 1 | -10/+14
Generate MAC(U) instruction instead of MACD(U) when the destination register is already chosen as the ACCL register. gcc/ 2020-12-29 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.md (maddsidi4_split): Skip macd gen, use mac insn instead. (macd): Update register letters. (umaddsidi4_split): Skip macdu gen, use macu insn instead. (macdu): Update register letters. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2020-12-29 | arc: flip if-condition predicates in secondary reload hook | Claudiu Zissulescu | 1 | -1/+1
The ARC code contains code which should only work with the old reload pass. Such code is found in the arc_secondary_reload hook; however, it was not properly guarded. Reverse the if-condition predicate such that req_equiv_mem is called when lra is not in progress. gcc/ 2020-12-29 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.c (arc_secondary_reload): Flip if-condition predicates. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2020-12-29 | arc: Make use reg_renumber safe. | Claudiu Zissulescu | 1 | -1/+1
The REGNO_OK_FOR_BASE_P is using reg_renumber array. However, it is not always defined. Use it only when it is defined. gcc/ 2020-12-29 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.h (REGNO_OK_FOR_BASE_P): Check if defined reg_renumber. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2020-12-29 | arc: Fix cached to uncached moves. | Claudiu Zissulescu | 1 | -2/+10
We need a temporary register when moving data from a cached memory to an uncached memory. Fix this issue and add a test for it. gcc/ 2020-12-29 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.c (prepare_move_operands): Use a temporary register when we have cached mem-to-uncached mem moves. gcc/testsuite/ 2020-12-29 Vladimir Isaev <isaev@synopsys.com> * gcc.target/arc/uncached-9.c: New test. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2020-12-29 | arc: Don't use predicated vadd2 instructions in mov patterns. | Claudiu Zissulescu | 2 | -5/+5
Update movdi, movdf and mov vectors not to use predicated vadd2 instructions. vadd2 is used as a "fast" move in these patterns. This fixes a number of failures in dejagnu. gcc/ 2020-12-29 Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.md (movdi_insn): Update pattern, no predicated vadd2 usage. (movdf_insn): Likewise. * config/arc/simdext.md (movVEC_insn): Likewise. Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2020-12-29 | i386: Rounding functions TLC | Uros Bizjak | 1 | -66/+64
Use copy_to_reg where appropriate, use int_mode_for_mode and fix comment indentation. 2020-12-29 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/i386-expand.c (ix86_gen_TWO52): Use REAL_MODE_FORMAT to determine number of mantissa bits. Use real_2expN instead of real_ldexp. (ix86_expand_rint): Use copy_to_reg. (ix86_expand_floorceildf_32): Ditto. (ix86_expand_truncdf_32): Ditto. (ix86_expand_rounddf_32): Ditto. (ix86_expand_floorceil): Use copy_to_reg and int_mode_for_mode. (ix86_expand_trunc): Ditto. (ix86_expand_round): Ditto.
2020-12-28 | i386: Fix __builtin_rint with FE_DOWNWARD rounding direction [PR96793] | Uros Bizjak | 1 | -7/+9
x86_expand_rint expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x = -0.0 special rule. x86_sse_copysign_to_positive fails to change the sign of the intermediate value, assumed to always be positive, back to positive. The patch adds one extra fabs that strips the sign from the intermediate value when flag_rounding_math is in effect. 2020-12-28 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/96793 * config/i386/i386-expand.c (ix86_expand_rint): Remove the sign of the intermediate value for flag_rounding_math. gcc/testsuite/ PR target/96793 * gcc.target/i386/pr96793-2.c: New test.
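The kind of sequence being discussed (take the absolute value, add and subtract a large power of two, then restore the sign) can be sketched as follows. This is an illustrative Python model under the assumption that the expander follows the usual 2**52 trick suggested by the ix86_gen_TWO52 helper named in this log; `rint_two52` is a hypothetical name. Python cannot switch the hardware rounding direction, so the sketch only shows the default round-to-nearest-even behaviour, not the FE_DOWNWARD failure itself.

```python
import math

TWO52 = 2.0 ** 52  # doubles carry 52 explicit mantissa bits


def rint_two52(x):
    """Round x to an integral value in the default round-to-nearest-even mode."""
    xa = math.fabs(x)
    if not xa < TWO52:
        # Large values, infinities and NaN are already integral or not roundable.
        return x
    # Adding 2**52 pushes the fraction bits out of the mantissa, so the
    # addition itself performs the rounding; the subtraction undoes the offset.
    xa = (xa + TWO52) - TWO52
    # Restore the sign of the input (this also yields -0.0 for small negatives).
    return math.copysign(xa, x)
```

Note that `math.copysign` can set the sign in either direction; according to the commit message, the helper used by the expander could not turn a negative intermediate back into a positive one, which is why the patch strips the sign first.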
2020-12-28 | i386: Use existing temporary register in rounding functions | Uros Bizjak | 1 | -5/+7
It is possible to avoid the call to force_reg and use existing temporary register in ix86_expand_trunc, ix86_expand_round and ix86_expand_rounddf_32 expanders. 2020-12-28 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/i386-expand.c (ix86_expand_trunc): Use existing temporary register to avoid a call to force_reg.
2020-12-28 | Fix standard name for zero/sign extend expanders | Hongyu Wang | 2 | -18/+22
gcc/ChangeLog: * config/i386/i386.md (optab): New code attr. * config/i386/sse.md (<code>v32qiv32hi2): Rename to ... (<optab>v32qiv32hi2) ... this. (<code>v16qiv16hi2): Likewise. (<code>v8qiv8hi2): Likewise. (<code>v16qiv16si2): Likewise. (<code>v8qiv8si2): Likewise. (<code>v4qiv4si2): Likewise. (<code>v16hiv16si2): Likewise. (<code>v8hiv8si2): Likewise. (<code>v4hiv4si2): Likewise. (<code>v8qiv8di2): Likewise. (<code>v4qiv4di2): Likewise. (<code>v2qiv2di2): Likewise. (<code>v8hiv8di2): Likewise. (<code>v4hiv4di2): Likewise. (<code>v2hiv2di2): Likewise. (<code>v8siv8di2): Likewise. (<code>v4siv4di2): Likewise. (<code>v2siv2di2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/pr92658-avx2-2.c: New test. * gcc.target/i386/pr92658-avx512bw-2.c: Likewise. * gcc.target/i386/pr92658-sse4-2.c: Likewise.
2020-12-24 | RISC-V: Fix python3 compatibility for multilib-generator | Kito Cheng | 1 | -1/+1
In Python 3 the subprocess return value is raw bytes, so it must be decoded before being used as a string; verified with Python 2 and Python 3. gcc/ChangeLog: * config/riscv/multilib-generator (arch_canonicalize): Call decode for the subprocess return value.
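The bytes-versus-string issue can be reproduced in a few lines. This is a minimal sketch rather than the actual multilib-generator code: `run_and_decode` is a hypothetical helper, and the command it runs in the usage note is just a stand-in that prints an arch string.

```python
import subprocess
import sys


def run_and_decode(args):
    """Run a command and return its standard output as a stripped text string."""
    raw = subprocess.check_output(args)  # bytes on Python 3
    if isinstance(raw, bytes):
        raw = raw.decode()  # turn the raw bytes into text before string use
    return raw.strip()
```

For example, `run_and_decode([sys.executable, "-c", "print('rv64imafdc')"])` returns a text string that can be compared or split like any other string, whereas the undecoded bytes would not compare equal to a `str` on Python 3.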
2020-12-23 | Darwin : Adjust handling of MACOSX_DEPLOYMENT_TARGET for macOS 11. | Iain Sandoe | 1 | -7/+16
The shift to macOS version 11 also means that '11' without any following '.x' is accepted as a valid version number. This adjusts the validation code to accept this and map it to 11.0.0 which matches what the clang toolchain appears to do. gcc/ChangeLog: * config/darwin-driver.c (validate_macosx_version_min): Allow MACOSX_DEPLOYMENT_TARGET=11. (darwin_default_min_version): Adjust warning spelling to avoid an apostrophe.
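The accepted forms can be illustrated with a small validator. This is a hedged Python sketch, not the logic of validate_macosx_version_min: the function name is hypothetical, and the rule that a bare major number is accepted only from 11 onwards is an assumption made for the example.

```python
def normalize_deployment_target(text):
    """Return (major, minor, patch) for a deployment target string, or None."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        return None
    numbers = [int(p) for p in parts]
    # Assumption: a bare major version is only valid from macOS 11 on,
    # so "10" on its own is still rejected.
    if len(numbers) == 1 and numbers[0] < 11:
        return None
    while len(numbers) < 3:
        numbers.append(0)  # missing components default to zero
    return tuple(numbers)
```

Under these assumptions, "11" and "11.0" both normalize to the same triple, which mirrors the mapping of 11 to 11.0.0 described in the commit message.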
2020-12-23 | i386: Fix __builtin_trunc with FE_DOWNWARD rounding direction [PR96793] | Uros Bizjak | 1 | -13/+14
x86_expand_truncdf_32 expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x = -0.0 special rule. x86_sse_copysign_to_positive fails to change the sign of the intermediate value, assumed to always be positive, back to positive. The patch adds one extra fabs that strips the sign from the intermediate value when flag_rounding_math is in effect. 2020-12-23 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/96793 * config/i386/i386-expand.c (ix86_expand_truncdf_32): Remove the sign of the intermediate value for flag_rounding_math. gcc/testsuite/ PR target/96793 * gcc.target/i386/pr96793-1.c: New test.
2020-12-22 | arm&aarch64: subdivide the type attribute "alu_shift_imm" | Qian Jianhua | 38 | -62/+170
The type attribute "alu_shift_imm" is subdivided into "alu_shift_imm_lsl_1to4" and "alu_shift_imm_other", to accommodate optimizations of some microarchitectures. Here is the detailed discussion. https://gcc.gnu.org/pipermail/gcc/2020-September/233594.html gcc/ * config/arm/types.md (define_attr "autodetect_type"): New. (define_attr "type"): Subdivide alu_shift_imm. * config/arm/common.md: New file. * config/aarch64/predicates.md: Include common.md. * config/arm/predicates.md: Include common.md. * config/aarch64/aarch64.md (*add_<shift>_<mode>): Set autodetect_type. (*add_<shift>_si_uxtw): Likewise. (*sub_<shift>_<mode>): Likewise. (*sub_<shift>_si_uxtw): Likewise. (*neg_<shift>_<mode>2): Likewise. (*neg_<shift>_si2_uxtw): Likewise. * config/arm/arm.md (*addsi3_carryin_shift): Likewise. (add_not_shift_cin): Likewise. (*subsi3_carryin_shift): Likewise. (*subsi3_carryin_shift_alt): Likewise. (*rsbsi3_carryin_shift): Likewise. (*rsbsi3_carryin_shift_alt): Likewise. (*arm_shiftsi3): Likewise. (*<arith_shift_insn>_multsi): Likewise. (*<arith_shift_insn>_shiftsi): Likewise. (subsi3_carryin): Set new type. (*if_arith_move): Set new type. (*if_move_arith): Set new type. (define_attr "core_cycles"): Use new type. * config/arm/arm-fixed.md (arm_ssatsihi_shift): Set autodetect_type. * config/arm/thumb2.md (*orsi_not_shiftsi_si): Likewise. (*thumb2_shiftsi3_short): Set new type. * config/aarch64/falkor.md (falkor_alu_1_xyz): Use new type. * config/aarch64/saphira.md (saphira_alu_1_xyz): Likewise. * config/aarch64/thunderx.md (thunderx_arith_shift): Likewise. * config/aarch64/thunderx2t99.md (thunderx2t99_alu_shift): Likewise. * config/aarch64/thunderx3t110.md (thunderx3t110_alu_shift): Likewise. (thunderx3t110_alu_shift1): Likewise. * config/aarch64/tsv110.md (tsv110_alu_shift): Likewise. * config/arm/arm1020e.md (1020alu_shift_op): Likewise. * config/arm/arm1026ejs.md (alu_shift_op): Likewise. * config/arm/arm1136jfs.md (11_alu_shift_op): Likewise.
* config/arm/arm926ejs.md (9_alu_op): Likewise. * config/arm/cortex-a15.md (cortex_a15_alu_shift): Likewise. * config/arm/cortex-a17.md (cortex_a17_alu_shiftimm): Likewise. * config/arm/cortex-a5.md (cortex_a5_alu_shift): Likewise. * config/arm/cortex-a53.md (cortex_a53_alu_shift): Likewise. * config/arm/cortex-a57.md (cortex_a57_alu_shift): Likewise. * config/arm/cortex-a7.md (cortex_a7_alu_shift): Likewise. * config/arm/cortex-a8.md (cortex_a8_alu_shift): Likewise. * config/arm/cortex-a9.md (cortex_a9_dp_shift): Likewise. * config/arm/cortex-m4.md (cortex_m4_alu): Likewise. * config/arm/cortex-m7.md (cortex_m7_alu_shift): Likewise. * config/arm/cortex-r4.md (cortex_r4_alu_shift): Likewise. * config/arm/exynos-m1.md (exynos_m1_alu_shift): Likewise. * config/arm/fa526.md (526_alu_shift_op): Likewise. * config/arm/fa606te.md (606te_alu_op): Likewise. * config/arm/fa626te.md (626te_alu_shift_op): Likewise. * config/arm/fa726te.md (726te_alu_shift_op): Likewise. * config/arm/fmp626.md (mp626_alu_shift_op): Likewise. * config/arm/marvell-pj4.md (pj4_shift): Likewise. (pj4_shift_conds): Likewise. (pj4_alu_shift): Likewise. (pj4_alu_shift_conds): Likewise. * config/arm/xgene1.md (xgene1_alu): Likewise. * config/arm/arm.c (xscale_sched_adjust_cost): Likewise.
2020-12-22 | i386: Fix __builtin_floor with FE_DOWNWARD rounding direction [PR96793] | Uros Bizjak | 1 | -5/+20
x86_expand_floorceil expander uses x86_sse_copysign_to_positive, which is unable to change the sign from - to +. When FE_DOWNWARD rounding direction is in effect, the expanded sequence that involves subtraction can trigger x - x = -0.0 special rule. x86_sse_copysign_to_positive fails to change the sign of the intermediate value, assumed to always be positive, back to positive. The patch adds one extra fabs that strips the sign from the intermediate value when flag_rounding_math is in effect. 2020-12-22 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/96793 * config/i386/i386-expand.c (ix86_expand_floorceil): Remove the sign of the intermediate value for flag_rounding_math. (ix86_expand_floorceildf_32): Ditto. gcc/testsuite/ PR target/96793 * gcc.target/i386/pr96793.c: New test.
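The "x - x = -0.0" special rule can be observed without touching the hardware rounding mode by using Python's `decimal` module, whose arithmetic follows the same sign-of-zero convention for round-toward-negative. This is only an analogy to the binary floating-point case in the commit, offered as a sketch; `x_minus_x` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_EVEN, localcontext


def x_minus_x(rounding):
    """Compute 1 - 1 under the given rounding mode and return the result."""
    with localcontext() as ctx:
        ctx.rounding = rounding
        return Decimal(1) - Decimal(1)
```

Under `ROUND_FLOOR` the exact-zero difference carries a negative sign, while under `ROUND_HALF_EVEN` it does not. Taking the absolute value (here `copy_abs()`) removes the sign again, which corresponds to the extra fabs that the patch adds for the intermediate value.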
2020-12-22 | Fix Typo. | liuhongt | 1 | -1/+1
gcc/ChangeLog * config/i386/i386.md (*one_cmpl<mode>2_1): Fix typo, change alternative from 2 to 1 in attr isa.
2020-12-21 | Darwin : Update the kernel version to macOS version mapping. | Iain Sandoe | 1 | -2/+15
With the change to macOS 11 and Darwin20, the algorithm for mapping kernel version to macOS version has changed. We now have darwin 20.X.Y => macOS 11.(X > 0 ? X - 1 : 0).??. It is currently unclear if the Y will be mapped to macOS patch version and, if so, whether it will be one-based or 0-based. Likewise, it's unknown if Darwin 21 will map to macOS 12, so these entries are unchanged for the present. gcc/ChangeLog: * config/darwin-driver.c (darwin_find_version_from_kernel): Compute the minor OS version from the minor kernel version.
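The mapping formula quoted above can be written out directly. This is a small Python sketch of the stated rule for Darwin 20 only, not the code in darwin_find_version_from_kernel; the function name is hypothetical, and the patch component is left out because the commit message says its mapping is not yet known.

```python
def macos_version_from_darwin20(kernel_minor):
    """Map a Darwin 20.X kernel minor version to (macOS major, macOS minor)."""
    if kernel_minor < 0:
        raise ValueError("kernel minor version must be non-negative")
    # darwin 20.X => macOS 11.(X > 0 ? X - 1 : 0)
    return (11, kernel_minor - 1 if kernel_minor > 0 else 0)
```

Note that both Darwin 20.0 and Darwin 20.1 map to macOS 11.0 under this rule.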
2020-12-20 | gcc: xtensa: implement bswapsi2, bswapdi2 and helpers | Max Filippov | 1 | -0/+21
2020-12-20 Max Filippov <jcmvbkbc@gmail.com> gcc/ * config/xtensa/xtensa.md (bswapsi2, bswapdi2): New patterns. gcc/testsuite/ * gcc.target/xtensa/bswap.c: New test. libgcc/ * config/xtensa/lib1funcs.S (__bswapsi2, __bswapdi2): New functions. * config/xtensa/t-xtensa (LIB1ASMFUNCS): Add _bswapsi2 and _bswapdi2.
2020-12-18 | aarch64: Extend aarch64-autovec-preference==2 to 128-bit SVE | Richard Sandiford | 1 | -4/+5
When compiling with -msve-vector-bits=128, aarch64_preferred_simd_mode would pass the same vector width to aarch64_simd_container_mode for both SVE and Advanced SIMD, and so Advanced SIMD would always “win”. This patch instead makes it choose directly between SVE and Advanced SIMD modes, so that aarch64-autovec-preference==2 and aarch64-autovec-preference==4 work for this configuration. (aarch64-autovec-preference shouldn't affect aarch64_simd_container_mode because that would have an ABI impact for things like GNU vectors.) gcc/ * config/aarch64/aarch64.c (aarch64_preferred_simd_mode): Use aarch64_full_sve_mode and aarch64_vq_mode directly, instead of going via aarch64_simd_container_mode.
2020-12-18 | Arm: MVE: Add missing complex mul iterators | Tamar Christina | 1 | -0/+4
It seems that when I split the patch I forgot to include these in the rot iterator. The uncommitted hunks were still in my local tree, so I didn't notice. gcc/ChangeLog: * config/arm/iterators.md (rot): Add UNSPEC_VCMUL, UNSPEC_VCMUL90, UNSPEC_VCMUL180, UNSPEC_VCMUL270.
2020-12-17 | arm: Add support for Cortex-A78C | Przemyslaw Wirkus | 3 | -5/+19
This patch adds support for -mcpu=cortex-a78c command line option. For more information about this processor, see [0]: [0] https://developer.arm.com/ip-products/processors/cortex-a/cortex-a78c gcc/ChangeLog: * config/arm/arm-cpus.in: Add Cortex-A78C core. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Regenerate. * doc/invoke.texi: Update docs.
2020-12-17 | vect, aarch64: Extend SVE vs Advanced SIMD costing decisions in vect_better_loop_vinfo_p | Kyrylo Tkachov | 1 | -9/+26
While experimenting with some backend costs for Advanced SIMD and SVE I hit many cases where GCC would pick SVE for VLA auto-vectorisation even when the backend very clearly presented cheaper costs for Advanced SIMD. For a simple float addition loop the SVE costs were:
vec.c:9:21: note: Cost model analysis:
  Vector inside of loop cost: 28
  Vector prologue cost: 2
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 2
  prologue iterations: 0
  epilogue iterations: 0
  Minimum number of vector iterations: 1
  Calculated minimum iters for profitability: 4
and for Advanced SIMD (Neon) they're:
vec.c:9:21: note: Cost model analysis:
  Vector inside of loop cost: 11
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 10
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0
vec.c:9:21: note: Runtime profitability threshold = 4
yet the SVE one was always picked. With guidance from Richard this seems to be due to the vinfo comparisons in vect_better_loop_vinfo_p, in particular the part with the big comment explaining the estimated_rel_new * 2 <= estimated_rel_old heuristic. This patch extends the comparisons by introducing a three-way estimate kind for poly_int values that the backend can distinguish. This allows vect_better_loop_vinfo_p to ask for minimum, maximum and likely estimates and pick Advanced SIMD over SVE when it is clearly cheaper. gcc/ * target.h (enum poly_value_estimate_kind): Define. (estimated_poly_value): Take an estimate kind argument. * target.def (estimated_poly_value): Update definition for the above. * doc/tm.texi: Regenerate. * targhooks.c (estimated_poly_value): Update prototype. * tree-vect-loop.c (vect_better_loop_vinfo_p): Use min, max and likely estimates of VF to pick between vinfos.
* config/aarch64/aarch64.c (aarch64_cmp_autovec_modes): Use estimated_poly_value instead of aarch64_estimated_poly_value. (aarch64_estimated_poly_value): Take a kind argument and handle it.
2020-12-17 | arm: Fix bootstrap | Andrea Corallo | 1 | -1/+1
gcc/ChangeLog 2020-12-17 Andrea Corallo <andrea.corallo@arm.com> * config/arm/arm_neon.h (vcreate_p64): Remove call to '__builtin_neon_vcreatedi'.
2020-12-16 | gcc: xtensa: add optimizations for shift operations | Takayuki 'January June' Suwa | 1 | -0/+43
2020-12-16 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> gcc/ * config/xtensa/xtensa.md (*ashlsi3_1, *ashlsi3_3x, *ashrsi3_3x) (*lshrsi3_3x): New patterns. gcc/testsuite/ * gcc.target/xtensa/shifts.c: New test.
2020-12-16 | rs6000: Add support for powerpc64le-unknown-freebsd | Piotr Kubaj | 1 | -3/+9
This implements support for powerpc64le architecture on FreeBSD. Since we don't have powerpcle (32-bit), I did not add support for powerpcle here. This remains to be changed if there is powerpcle support in the future. 2020-12-15 Piotr Kubaj <pkubaj@FreeBSD.org> gcc/ * config.gcc (powerpc*le-*-freebsd*): Add. * configure.ac (powerpc*le-*-freebsd*): Ditto. * configure: Regenerate. * config/rs6000/freebsd64.h (ASM_SPEC_COMMON): Use ENDIAN_SELECT. (DEFAULT_ASM_ENDIAN): Add little endian support. (LINK_OS_FREEBSD_SPEC64): Ditto.
2020-12-16 | gcc: xtensa: rearrange DI mode constant loading | Takayuki 'January June' Suwa | 2 | -2/+32
2020-12-16 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> gcc/ * config/xtensa/xtensa.c (xtensa_emit_move_sequence): Try to replace 'l32r' with 'movi' + 'slli' when optimizing for size. * config/xtensa/xtensa.md (movdi): Split loading DI mode constant into register pair into two loads of SI mode constants.
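The idea of replacing a literal-pool load with 'movi' + 'slli' can be illustrated by checking whether a constant is a small immediate shifted left. The sketch below is a hypothetical Python model, not the logic in xtensa_emit_move_sequence: it assumes that MOVI takes a signed 12-bit immediate (-2048..2047), and the helper name and the exact conditions are made up for the example.

```python
MOVI_MIN, MOVI_MAX = -2048, 2047  # assumed signed 12-bit MOVI immediate range


def split_movi_slli(value):
    """Return (imm, shift) with imm << shift == value, or None if impossible."""
    if MOVI_MIN <= value <= MOVI_MAX:
        return (value, 0)  # fits a single movi, no shift needed
    imm, shift = value, 0
    while imm % 2 == 0:  # move trailing zero bits into the shift amount
        imm //= 2
        shift += 1
    if MOVI_MIN <= imm <= MOVI_MAX and shift < 32:
        return (imm, shift)
    return None  # would still need a literal-pool load (l32r)
```

A constant such as 0x10000 splits into an immediate of 1 and a shift of 16, while a constant with many significant bits has no such split.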
2020-12-16 | Arm: MVE: Split refactoring of remaining complex intrinsics | Tamar Christina | 5 | -138/+54
This refactors the complex numbers bits of MVE to go through the same unspecs as the NEON variant. This is pre-work to allow code to be shared between NEON and MVE for the complex vectorization patches. gcc/ChangeLog: * config/arm/arm_mve.h (__arm_vcmulq_rot90_f16): (__arm_vcmulq_rot270_f16, _arm_vcmulq_rot180_f16, __arm_vcmulq_f16, __arm_vcmulq_rot90_f32, __arm_vcmulq_rot270_f32, __arm_vcmulq_rot180_f32, __arm_vcmulq_f32, __arm_vcmlaq_f16, __arm_vcmlaq_rot180_f16, __arm_vcmlaq_rot270_f16, __arm_vcmlaq_rot90_f16, __arm_vcmlaq_f32, __arm_vcmlaq_rot180_f32, __arm_vcmlaq_rot270_f32, __arm_vcmlaq_rot90_f32): Update builtin calls. * config/arm/arm_mve_builtins.def (vcmulq_f, vcmulq_rot90_f, vcmulq_rot180_f, vcmulq_rot270_f, vcmlaq_f, vcmlaq_rot90_f, vcmlaq_rot180_f, vcmlaq_rot270_f): Removed. (vcmulq, vcmulq_rot90, vcmulq_rot180, vcmulq_rot270, vcmlaq, vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270): New. * config/arm/iterators.md (mve_rot): Add UNSPEC_VCMLA, UNSPEC_VCMLA90, UNSPEC_VCMLA180, UNSPEC_VCMLA270, UNSPEC_VCMUL, UNSPEC_VCMUL90, UNSPEC_VCMUL180, UNSPEC_VCMUL270. (VCMUL): New. * config/arm/mve.md (mve_vcmulq_f<mode, mve_vcmulq_rot180_f<mode>, mve_vcmulq_rot270_f<mode>, mve_vcmulq_rot90_f<mode>, mve_vcmlaq_f<mode>, mve_vcmlaq_rot180_f<mode>, mve_vcmlaq_rot270_f<mode>, mve_vcmlaq_rot90_f<mode>): Removed. (mve_vcmlaq<mve_rot><mode>, mve_vcmulq<mve_rot><mode>, mve_vcaddq<mve_rot><mode>, cadd<rot><mode>3, mve_vcaddq<mve_rot><mode>): New. * config/arm/unspecs.md (UNSPEC_VCMUL90, UNSPEC_VCMUL270, UNSPEC_VCMUL, UNSPEC_VCMUL180): New. (VCMULQ_F, VCMULQ_ROT180_F, VCMULQ_ROT270_F, VCMULQ_ROT90_F, VCMLAQ_F, VCMLAQ_ROT180_F, VCMLAQ_ROT90_F, VCMLAQ_ROT270_F): Removed.
2020-12-16 | Arm: Add NEON and MVE RTL patterns for Complex Addition. | Tamar Christina | 7 | -72/+58
This adds implementation for the optabs for complex additions. With this the following C code:
void f90 (float complex a[restrict N], float complex b[restrict N], float complex c[restrict N])
{
  for (int i=0; i < N; i++)
    c[i] = a[i] + (b[i] * I);
}
generates
f90:
        add r3, r2, #1600
.L2:
        vld1.32 {q8}, [r0]!
        vld1.32 {q9}, [r1]!
        vcadd.f32 q8, q8, q9, #90
        vst1.32 {q8}, [r2]!
        cmp r3, r2
        bne .L2
        bx lr
instead of
f90:
        add r3, r2, #1600
.L2:
        vld2.32 {d24-d27}, [r0]!
        vld2.32 {d20-d23}, [r1]!
        vsub.f32 q8, q12, q11
        vadd.f32 q9, q13, q10
        vst2.32 {d16-d19}, [r2]!
        cmp r3, r2
        bne .L2
        bx lr
gcc/ChangeLog: * config/arm/arm_mve.h (__arm_vcaddq_rot90_u8, __arm_vcaddq_rot270_u8, __arm_vcaddq_rot90_s8, __arm_vcaddq_rot270_s8, __arm_vcaddq_rot90_u16, __arm_vcaddq_rot270_u16, __arm_vcaddq_rot90_s16, __arm_vcaddq_rot270_s16, __arm_vcaddq_rot90_u32, __arm_vcaddq_rot270_u32, __arm_vcaddq_rot90_s32, __arm_vcaddq_rot270_s32, __arm_vcaddq_rot90_f16, __arm_vcaddq_rot270_f16, __arm_vcaddq_rot90_f32, __arm_vcaddq_rot270_f32): Update builtin calls. * config/arm/arm_mve_builtins.def (vcaddq_rot90_u, vcaddq_rot270_u, vcaddq_rot90_s, vcaddq_rot270_s, vcaddq_rot90_f, vcaddq_rot270_f): Removed. (vcaddq_rot90, vcaddq_rot270): New. * config/arm/constraints.md (Dz): Include MVE. * config/arm/iterators.md (mve_rot): New. (supf): Remove VCADDQ_ROT270_S, VCADDQ_ROT270_U, VCADDQ_ROT90_S, VCADDQ_ROT90_U. (VCADDQ_ROT270, VCADDQ_ROT90): Removed. * config/arm/mve.md (mve_vcaddq_rot270_<supf><mode, mve_vcaddq_rot90_<supf><mode>, mve_vcaddq_rot270_f<mode>, mve_vcaddq_rot90_f<mode>): Removed. (mve_vcaddq<mve_rot><mode>, mve_vcaddq<mve_rot><mode>): New. * config/arm/unspecs.md (VCADDQ_ROT270_S, VCADDQ_ROT90_S, VCADDQ_ROT270_U, VCADDQ_ROT90_U, VCADDQ_ROT270_F, VCADDQ_ROT90_F): Removed. * config/arm/vec-common.md (cadd<rot><mode>3): New.
2020-12-16 | AArch64: Add NEON, SVE and SVE2 RTL patterns for Complex Addition. | Tamar Christina | 4 | -0/+36
This adds implementation for the optabs for add complex operations. With this the following C code:
void f90 (float complex a[restrict N], float complex b[restrict N], float complex c[restrict N])
{
  for (int i=0; i < N; i++)
    c[i] = a[i] + (b[i] * I);
}
generates
f90:
        mov x3, 0
        .p2align 3,,7
.L2:
        ldr q0, [x0, x3]
        ldr q1, [x1, x3]
        fcadd v0.4s, v0.4s, v1.4s, #90
        str q0, [x2, x3]
        add x3, x3, 16
        cmp x3, 1600
        bne .L2
        ret
instead of
f90:
        add x3, x1, 1600
        .p2align 3,,7
.L2:
        ld2 {v4.4s - v5.4s}, [x0], 32
        ld2 {v2.4s - v3.4s}, [x1], 32
        fsub v0.4s, v4.4s, v3.4s
        fadd v1.4s, v5.4s, v2.4s
        st2 {v0.4s - v1.4s}, [x2], 32
        cmp x3, x1
        bne .L2
        ret
gcc/ChangeLog: * config/aarch64/aarch64-simd.md (cadd<rot><mode>3): New. * config/aarch64/iterators.md (SVE2_INT_CADD_OP): New. * config/aarch64/aarch64-sve.md (cadd<rot><mode>3): New. * config/aarch64/aarch64-sve2.md (cadd<rot><mode>3): New.
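The arithmetic that both code sequences implement for c[i] = a[i] + (b[i] * I) can be spelled out on real and imaginary parts. This is a short Python sketch for illustration only; `cadd_rot90` is a hypothetical name and is not tied to any GCC pattern.

```python
def cadd_rot90(a, b):
    """Compute a + b*i using only the real and imaginary parts."""
    # Multiplying b by i rotates it by 90 degrees: (re, im) -> (-im, re).
    return complex(a.real - b.imag, a.imag + b.real)
```

The subtraction and addition here correspond to the separate subtract and add on de-interleaved real and imaginary lanes in the older sequence; the single complex-add instruction with a #90 rotation performs the same computation on interleaved data.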
2020-12-16 | Fix instruction length for MMA insns. | Pat Haugen | 1 | -21/+11
Prefixed instructions should not have their length explicitly set to '8'. The function get_attr_length() will adjust the length appropriately based on the value of the "prefixed" attribute. 2020-12-16 Pat Haugen <pthaugen@linux.ibm.com> gcc/ * config/rs6000/mma.md (*movxo, mma_<vvi4i4i8>, mma_<avvi4i4i8>, mma_<vvi4i4i2>, mma_<avvi4i4i2>, mma_<vvi4i4>, mma_<avvi4i4>, mma_<pvi4i2>, mma_<apvi4i2>, mma_<vvi4i4i4>, mma_<avvi4i4i4>): Remove explicit setting of length attribute.
2020-12-16 | opts: Remove all usages of Report keyword. | Martin Liska | 61 | -1083/+1083
gcc/brig/ChangeLog: * lang.opt: Remove usage of Report. gcc/c-family/ChangeLog: * c.opt: Remove usage of Report. gcc/ChangeLog: * common.opt: Remove usage of Report. * config/aarch64/aarch64.opt: Ditto. * config/alpha/alpha.opt: Ditto. * config/arc/arc.opt: Ditto. * config/arm/arm.opt: Ditto. * config/avr/avr.opt: Ditto. * config/bfin/bfin.opt: Ditto. * config/bpf/bpf.opt: Ditto. * config/c6x/c6x.opt: Ditto. * config/cr16/cr16.opt: Ditto. * config/cris/cris.opt: Ditto. * config/cris/elf.opt: Ditto. * config/csky/csky.opt: Ditto. * config/darwin.opt: Ditto. * config/fr30/fr30.opt: Ditto. * config/frv/frv.opt: Ditto. * config/ft32/ft32.opt: Ditto. * config/gcn/gcn.opt: Ditto. * config/i386/cygming.opt: Ditto. * config/i386/i386.opt: Ditto. * config/ia64/ia64.opt: Ditto. * config/ia64/ilp32.opt: Ditto. * config/linux-android.opt: Ditto. * config/linux.opt: Ditto. * config/lm32/lm32.opt: Ditto. * config/m32r/m32r.opt: Ditto. * config/m68k/m68k.opt: Ditto. * config/mcore/mcore.opt: Ditto. * config/microblaze/microblaze.opt: Ditto. * config/mips/mips.opt: Ditto. * config/mmix/mmix.opt: Ditto. * config/mn10300/mn10300.opt: Ditto. * config/moxie/moxie.opt: Ditto. * config/msp430/msp430.opt: Ditto. * config/nds32/nds32.opt: Ditto. * config/nios2/elf.opt: Ditto. * config/nios2/nios2.opt: Ditto. * config/nvptx/nvptx.opt: Ditto. * config/pa/pa.opt: Ditto. * config/pdp11/pdp11.opt: Ditto. * config/pru/pru.opt: Ditto. * config/riscv/riscv.opt: Ditto. * config/rl78/rl78.opt: Ditto. * config/rs6000/aix64.opt: Ditto. * config/rs6000/linux64.opt: Ditto. * config/rs6000/rs6000.opt: Ditto. * config/rs6000/sysv4.opt: Ditto. * config/rx/elf.opt: Ditto. * config/rx/rx.opt: Ditto. * config/s390/s390.opt: Ditto. * config/s390/tpf.opt: Ditto. * config/sh/sh.opt: Ditto. * config/sol2.opt: Ditto. * config/sparc/long-double-switch.opt: Ditto. * config/sparc/sparc.opt: Ditto. * config/tilegx/tilegx.opt: Ditto. * config/tilepro/tilepro.opt: Ditto. * config/v850/v850.opt: Ditto. 
* config/visium/visium.opt: Ditto. * config/vms/vms.opt: Ditto. * config/vxworks.opt: Ditto. * config/xtensa/xtensa.opt: Ditto. gcc/lto/ChangeLog: * lang.opt: Remove usage of Report.
2020-12-16 | rs6000: Use subreg for QI/HI vector init | Kewen Lin | 1 | -11/+3
This patch is to use paradoxical subreg instead of zero_extend for promoting QI/HI to SI/DI when we want to construct one vector with these modes. Since we do the gpr->vsx movement and vector merge or pack later, the high part is useless and safe to use paradoxical subreg. It can avoid useless rlwinms generated for signed cases. Bootstrapped/regtested on powerpc64le-linux-gnu P9. gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_expand_vector_init): Use paradoxical subreg instead of zero_extend for QI/HI promotion. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr96933-1.c: Adjusted to check no rlwinm. * gcc.target/powerpc/pr96933-2.c: Likewise.
2020-12-16 | arm: Replace calls to __builtin_vcgt* by <,> in arm_neon.h [PR66791] | Prathamesh Kulkarni | 2 | -30/+28
gcc/ 2020-12-16 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR target/66791 * config/arm/arm_neon.h: Replace calls to __builtin_vcgt* by <, > operators in vclt and vcgt intrinsics respectively. * config/arm/arm_neon_builtins.def: Remove entry for vcgt and vcgtu.
2020-12-16 | arm: Replace calls to __builtin_vneg* by - in arm_neon.h [PR66791] | Prathamesh Kulkarni | 2 | -9/+8
gcc/ 2020-12-16 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR target/66791 * config/arm/arm_neon.h: Replace calls to __builtin_vneg* by - operator in vneg intrinsics. * config/arm/arm_neon_builtins.def: Remove entry for vneg.
2020-12-16 | arm: Replace calls to __builtin_vcreate* in arm_neon.h [PR66791] | Prathamesh Kulkarni | 2 | -12/+11
gcc/ 2020-12-16 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> PR target/66791 * config/arm/arm_neon.h: Replace calls to __builtin_vcreate* in vcreate intrinsics. * config/arm/arm_neon_builtins.def: Remove entry for vcreate.
2020-12-15 | i386: Fix up -march=x86-64-v[234] vs. target attribute [PR98274] | Jakub Jelinek | 1 | -7/+7
The following testcase fails to compile. The problem is that when ix86_option_override_internal is called the first time for command line, it sees -mtune= wasn't present on the command line and so as fallback sets ix86_tune_string to ix86_arch_string value ("x86-64-v2"), but ix86_tune_specified is false, so we don't find the tuning in the table but don't error on it. When processing the target attribute, ix86_tune_string is what it was earlier left with, but this time ix86_tune_specified is true and so we error on it. The following patch does what is done already e.g. for "x86-64" march, in particular default the tuning to "generic". 2020-12-15 Jakub Jelinek <jakub@redhat.com> PR target/98274 * config/i386/i386-options.c (ix86_option_override_internal): Set ix86_tune_string to "generic" even when it wasn't specified and ix86_arch_string is "x86-64-v2", "x86-64-v3" or "x86-64-v4". Remove useless {}s around a single statement. * gcc.target/i386/pr98274.c: New test.
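The fallback rule described above can be modelled roughly as follows. This is a heavily simplified sketch, not the code in i386-options.c: `default_tune` and `is_level_arch` are hypothetical names, and a NULL `tune` stands for "-mtune= was not given".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: is ARCH one of the x86-64-v[234] levels?  */
static bool
is_level_arch (const char *arch)
{
  return strcmp (arch, "x86-64-v2") == 0
	 || strcmp (arch, "x86-64-v3") == 0
	 || strcmp (arch, "x86-64-v4") == 0;
}

/* Pick the tuning name.  An explicit -mtune= wins; otherwise
   "x86-64" and the x86-64-v[234] levels default to "generic",
   and any other -march= value is reused as the tuning.  */
static const char *
default_tune (const char *arch, const char *tune)
{
  if (tune != NULL)
    return tune;
  if (strcmp (arch, "x86-64") == 0 || is_level_arch (arch))
    return "generic";
  return arch;
}
```

With this rule the tuning string left behind after command-line processing is always a name that exists in the tuning table, so later processing of a target attribute no longer trips over "x86-64-v2".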
2020-12-15i386: Make -march=x86-64-v[234] behave more like other -march= optionsJakub Jelinek1-11/+0
If somebody has -march=x86-64-v2 (or -v3 or -v4) in $CFLAGS, $CXXFLAGS etc., then -m32 or -mabi=ms stops working. What is worse, if one configures gcc --with-arch-64=x86-64-v2 (or -v3 or -v4), then -mabi=ms stops working. I think that is a nightmare user experience. It is ok that x86-64-v[234] behave slightly differently from other -march= options (in that they imply unless overridden -mtune=generic rather than -mtune= equal to the -march argument), but the error when one mixes it with -mabi=ms, or -m32 doesn't improve anything. It is true that the exact option set is only defined in the x86-64 psABI (IMHO that is a mistake too, we should copy that into the GCC documentation like we document it for any other -march= option), but there is no reason why that exact set of CPU features can't be used for other ABIs, it is just a set of CPU features. If we add micro-architecture levels to the 32-bit ABI (I doubt anyone wants to do that, but just hypothetically), then those micro-architecture levels certainly wouldn't be called x86-64-v* but perhaps i386-v*. In the tests, __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 can't be expected on -m32 not because the CPU feature wouldn't be set, but because the instruction is 64-bit only and 32-bit code doesn't have __int128 etc. support. 2020-12-15 Jakub Jelinek <jakub@redhat.com> * config/i386/i386-options.c (ix86_option_override_internal): Don't error on -march=x86-64-v[234] with -m32 or -mabi=ms. * config.gcc: Don't reject --with-arch=x86-64-v[234] or --with-arch_32=x86-64-v[234]. * doc/invoke.texi (-march=x86-64-v[234]): Document what the option does for other ABIs. * gcc.target/i386/x86-64-v2.c: Don't expect __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32. * gcc.target/i386/x86-64-v2-other.c: New test. * gcc.target/i386/x86-64-v2-msabi.c: New test. * gcc.target/i386/x86-64-v3.c: Fix a comment pasto. Don't expect __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32. * gcc.target/i386/x86-64-v3-other.c: New test. 
* gcc.target/i386/x86-64-v3-msabi.c: New test. * gcc.target/i386/x86-64-v4.c: Don't expect __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 to be defined with -m32. * gcc.target/i386/x86-64-v4-other.c: New test. * gcc.target/i386/x86-64-v4-msabi.c: New test.
2020-12-14gcc: xtensa: fix PR target/98285Max Filippov2-9/+14
2020-12-14 Max Filippov <jcmvbkbc@gmail.com> gcc/ * config/xtensa/predicates.md (addsubx_operand): Change accepted values from 2/4/8 to 1..3. * config/xtensa/xtensa.md (*addx, *subx): Change RTL pattern to use 'ashift' instead of 'mult'. Update operands[3] value. gcc/testsuite/ * gcc.target/xtensa/pr98285.c: New test.
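The switch from 'mult' to 'ashift' rests on the identity a * 2^k + b == (a << k) + b, with the accepted operand changing from the factor (2, 4, 8) to the shift count (1..3). A small sketch of the arithmetic only; the function names are hypothetical and nothing here is taken from xtensa.md. The `factor_from_shift` helper merely illustrates that a shift count can be mapped back to the factor named in the ADDX2/ADDX4/ADDX8 mnemonics; whether that is what the operands[3] update does is an assumption.

```c
#include <stdint.h>

/* Old form: scale A by a factor of 2, 4 or 8, then add B.  */
static inline uint32_t
addx_mult (uint32_t a, uint32_t b, unsigned factor)
{
  return a * factor + b;
}

/* New form: shift A left by 1, 2 or 3, then add B.  */
static inline uint32_t
addx_shift (uint32_t a, uint32_t b, unsigned shift)
{
  return (a << shift) + b;
}

/* Map a shift count in 1..3 back to the factor 2, 4 or 8.  */
static inline unsigned
factor_from_shift (unsigned shift)
{
  return 1u << shift;
}
```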
2020-12-15rs6000: Update the processor defaults for FreeBSDGerald Pfeifer1-3/+2
gcc/ChangeLog: 2020-12-13 Piotr Kubaj <pkubaj@FreeBSD.org> Gerald Pfeifer <gerald@pfeifer.com> * config/rs6000/freebsd64.h (PROCESSOR_DEFAULT): Update to PROCESSOR_PPC7450. (PROCESSOR_DEFAULT64): Update to PROCESSOR_POWER8.
2020-12-14AArch64: Add support for --with-tuneWilco Dijkstra1-4/+6
Add support for --with-tune. Like --with-cpu and --with-arch, the argument is validated and transformed into a -mtune option to be processed like any other command-line option. --with-tune has no effect if a -mcpu or -mtune option is used. The validating code didn't allow --with-cpu=native, so explicitly allow that. Co-authored-by: Delia Burduv <delia.burduv@arm.com> Bootstrap OK, regress pass, OK to commit? 2020-09-03 Wilco Dijkstra <wdijkstr@arm.com> gcc/ * config.gcc (aarch64*-*-*): Add --with-tune. Support --with-cpu=native. * config/aarch64/aarch64.h (OPTION_DEFAULT_SPECS): Add --with-tune. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_tune_cortex_a76): New effective target test. * gcc.target/aarch64/with-tune-config.c: New test. * gcc.target/aarch64/with-tune-march.c: Likewise. * gcc.target/aarch64/with-tune-mcpu.c: Likewise. * gcc.target/aarch64/with-tune-mtune.c: Likewise.
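The precedence rule stated above (the configured --with-tune default applies only when neither -mcpu nor -mtune is given) can be sketched as below. This models the behaviour only; it is not how OPTION_DEFAULT_SPECS is written, and `injected_mtune` is a hypothetical name. NULL stands for "option not present".

```c
#include <stddef.h>

/* Return the tune value to inject as a default -mtune=, or NULL
   when nothing should be injected.  WITH_TUNE is the configure-time
   value; MCPU and MTUNE are the user's command-line options.  */
static const char *
injected_mtune (const char *with_tune, const char *mcpu, const char *mtune)
{
  /* Any explicit -mcpu= or -mtune= overrides the configured default.  */
  if (mcpu != NULL || mtune != NULL)
    return NULL;
  return with_tune;
}
```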
2020-12-14arm: Auto-vectorization for MVE: vnegChristophe Lyon4-12/+14
This patch enables MVE vneg instructions for auto-vectorization. MVE vnegq insns in mve.md are modified to use 'neg' instead of unspec expression. The neg<mode>2 expander is added to vec-common.md. Existing patterns in neon.md are prefixed with neon_. It's not clear why we have different patterns for VDQW and VH in neon.md, when VDQWH handles both, and patterns with VDQ have provision for attributes for FP modes. Another question is why <absneg_str><mode>2 always sets neon_abs<q> type when it also handles neon_neg<q> cases. 2020-12-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/mve.md (mve_vnegq_f): Use 'neg' instead of unspec. (mve_vnegq_s): Likewise. * config/arm/neon.md (neg<mode>2): Rename into neon_neg<mode>2. (<absneg_str><mode>2): Rename into neon_<absneg_str><mode>2. (neon_v<absneg_str><mode>): Call gen_neon_<absneg_str><mode>2. (vashr<mode>3): Call gen_neon_neg<mode>2. (vlshr<mode>3): Call gen_neon_neg<mode>2. (neon_vneg<mode>): Call gen_neon_neg<mode>2. * config/arm/unspecs.md (VNEGQ_F, VNEGQ_S): Remove. * config/arm/vec-common.md (neg<mode>2): New expander. gcc/testsuite/ * gcc.target/arm/simd/mve-vneg.c: Add tests for vneg.
2020-12-14arm: Auto-vectorization for MVE: vmvnChristophe Lyon5-10/+19
This patch enables MVE vmvnq instructions for auto-vectorization. MVE vmvnq insns in mve.md are modified to use 'not' instead of unspec expression to support one_cmpl<mode>2. The one_cmpl<mode>2 expander is added to vec-common.md. 2020-12-11 Christophe Lyon <christophe.lyon@linaro.org> gcc/ * config/arm/iterators.md (VDQNOTM2): New mode iterator. (supf): Remove VMVNQ_S and VMVNQ_U. (VMVNQ): Remove. * config/arm/mve.md (mve_vmvnq_u<mode>): New entry for vmvn instruction using expression not. (mve_vmvnq_s<mode>): New expander. * config/arm/neon.md (one_cmpl<mode>2): Renamed into one_cmpl<mode>2_neon. * config/arm/unspecs.md (VMVNQ_S, VMVNQ_U): Remove. * config/arm/vec-common.md (one_cmpl<mode>2): New expander. gcc/testsuite/ * gcc.target/arm/simd/mve-vmvn.c: Add tests for vmvn.