path: root/gcc
Columns: Age, Commit message, Author, Files, Lines
2023-08-20Testsuite: fix analyzer tests on DarwinFrancois-Xavier Coudert9-0/+16
On macOS, system headers redefine by default some macros (memcpy, memmove, etc) to checked versions, which defeats the analyzer. We want to turn this off. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104042

gcc/testsuite/ChangeLog:

    PR analyzer/104042
    * gcc.dg/analyzer/analyzer.exp: Pass -D_FORTIFY_SOURCE=0 on Darwin.
    * gcc.dg/analyzer/fd-bind.c: Add missing <string.h> header.
    * gcc.dg/analyzer/fd-datagram-socket.c: Likewise.
    * gcc.dg/analyzer/fd-listen.c: Likewise.
    * gcc.dg/analyzer/fd-socket-misuse.c: Likewise.
    * gcc.dg/analyzer/fd-stream-socket-active-open.c: Likewise.
    * gcc.dg/analyzer/fd-stream-socket-passive-open.c: Likewise.
    * gcc.dg/analyzer/fd-stream-socket.c: Likewise.
    * gcc.dg/analyzer/fd-symbolic-socket.c: Likewise.
2023-08-20MATCH: Sink convert for vec_condAndrew Pinski2-0/+31
Convert can be sunk into a vec_cond if both sides fold. Unlike other unary operations, we need to check that we can still handle the vec_cond, i.e. that the type of its first operand is the same as the new truth type.

I tried a few different versions of this patch:

    view_convert to the new truth_type, but that does not work as we always support all vec_cond afterwards.
    using expand_vec_cond_expr_p, but that would allow too much.

I also tried to see if view_convert can be handled here, but we end up with:

    _3 = VEC_COND_EXPR <_2, { Nan(-1), Nan(-1), Nan(-1), Nan(-1) }, { 0.0, 0.0, 0.0, 0.0 }>;

which isel does not know how to handle as just being a view_convert from `vector(4) <signed-boolean:32>` to `vector(4) float`, and which causes a regression with `g++.target/i386/pr88152.C`.

Note, in the case of the SVE testcase, we will sink negate after the convert and be able to remove a few extra instructions in the end. Also with this change gcc.target/aarch64/sve/cond_unary_5.c will now pass.

Committed as approved after being bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.

gcc/ChangeLog:

    PR tree-optimization/111006
    PR tree-optimization/110986
    * match.pd: (op(vec_cond(a,b,c))): Handle convert for op.

gcc/testsuite/ChangeLog:

    PR tree-optimization/111006
    * gcc.target/aarch64/sve/cond_convert_7.c: New test.
2023-08-20fix misleading indentation breaking bootstrapMartin Uecker1-2/+2
Fix indentation issue introduced by 966f3c13 "Fix format attribute for printf". gcc/c-family/ChangeLog: * c-format.cc: Fix indentation.
2023-08-19improve error when /usr/include isn't found [PR90835]Eric Gallager1-1/+7
This is a pretty simple patch that ought to help Darwin users understand better why their build is failing when they forget to pass the --with-sysroot= flag to configure. gcc/ChangeLog: PR target/90835 * Makefile.in: improve error message when /usr/include is missing
2023-08-20Fix format attribute for printfTomas Kalibera1-0/+36
For a long time (since GCC 4.4?), GCC has supported annotating functions with either the format attribute "gnu_printf" or "ms_printf" to distinguish between different format string interpretations.

However, it seems like the attribute is ignored for the "printf" symbol; regardless of what the function declaration says, GCC treats it as "ms_printf". This has become an issue now that mingw-w64 supports using the UCRT instead of msvcrt.dll, and in this case the stdio functions are declared with the gnu_printf attribute, and inttypes.h uses the same format specifiers as in GNU mode.

A reproducible example of the problem:

    $ cat format.c
    __attribute__((__format__ (gnu_printf, 1, 2))) int printf (const char *__format, ...);
    __attribute__((__format__ (gnu_printf, 1, 2))) int othername (const char *__format, ...);
    void function(void)
    {
        long long unsigned x = 42;
        othername("%llu\n", x);
        printf("%llu\n", x);
    }
    $ x86_64-w64-mingw32-gcc -c -Wformat format.c
    format.c: In function 'function':
    format.c:7:15: warning: unknown conversion type character 'l' in format [-Wformat=]
        7 |     printf("%llu\n", x);
          |               ^
    format.c:7:12: warning: too many arguments for format [-Wformat-extra-args]
        7 |     printf("%llu\n", x);
          |            ^~~~~~~~

Note how both functions, printf and othername, are declared with identical gnu_printf format attributes - GCC does take this into account for "othername" and doesn't produce a warning, but GCC seems to disregard the attribute in the printf declaration and behave as if it was declared as ms_printf.

If the printf function declaration is changed into a static inline function, the actual attribute used is honored though.

gcc/c-family/ChangeLog:

    PR c/95130
    * c-format.cc: skip default format for printf symbol if explicitly declared by prototype.

Signed-off-by: Tomas Kalibera <tomas.kalibera@gmail.com>
Signed-off-by: Jonathan Yong <10walls@gmail.com>
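As a rough illustration of what the format attribute does for a function other than printf, here is a minimal sketch. The wrapper name `format_into` is hypothetical and not part of the commit; the attribute is guarded so the code still builds with compilers that do not define `__GNUC__`.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper, not part of the commit.  The attribute asks the
   compiler to check calls against GNU printf rules: the format string is
   argument 3 and the variadic arguments start at argument 4.  */
#if defined(__GNUC__)
__attribute__((__format__(gnu_printf, 3, 4)))
#endif
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    /* vsnprintf returns the number of characters that the full output
       needs, excluding the terminating null byte.  */
    int written = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return written;
}
```

With -Wformat enabled, a call such as `format_into(buf, sizeof buf, "%llu", x)` with an `unsigned long long x` should be accepted under the gnu_printf archetype, which is the behaviour the commit wants for an explicitly annotated printf as well.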
2023-08-20Daily bump.GCC Administrator2-1/+15
2023-08-19omp-expand.cc: Fix wrong code with non-rectangular loop nest [PR111017]Tobias Burnus1-1/+2
Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in expand_omp_for_* were inserted with

    gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);

except the one dealing with the multiplicative factor, which was inserted with

    gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING);

That commit for PR103208 fixed the issue of some missing regimplify of operands of GIMPLE_CONDs by moving the condition handling to the new function expand_omp_build_cond. That function has a 'bool after = false' argument to switch between the two variants; however, all callers omitted this argument.

This commit reinstates the prior behavior by passing 'true' for the factor != 0 condition, fixing the included testcase.

    PR middle-end/111017

gcc/
    * omp-expand.cc (expand_omp_for_init_vars): Pass after=true to expand_omp_build_cond for 'factor != 0' condition, resulting in pre-r12-5295-g47de0b56ee455e code for the gimple insert.

libgomp/
    * testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.
2023-08-19Loongarch: Fix plugin header missing install.Guo Jie1-0/+4
gcc/ChangeLog: * config/loongarch/t-loongarch: Add loongarch-driver.h into TM_H. Add loongarch-def.h and loongarch-tune.h into OPTIONS_H_EXTRA. Co-authored-by: Lulu Cheng <chenglulu@loongson.cn>
2023-08-19Daily bump.GCC Administrator3-1/+151
2023-08-18testsuite: Improve test in dg-require-python-hThiago Jung Bauermann1-2/+12
If GCC is tested with a sysroot which doesn't contain a Python installation (e.g., with a command such as "make check-gcc-c FLAGS_UNDER_TEST="--sysroot=/some/path"), but there's a python3-config in $PATH, then the testsuite will pick up the host's Python.h which can't actually be used:

    Executing on host: python3-config --includes (timeout = 300)
    spawn -ignore SIGHUP python3-config --includes
    -I/usr/include/python3.10 -I/usr/include/python3.10
    Executing on host: /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc --sysroot=/some/sysroot/libc -Wl,-dynamic-linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/some/sysroot/libc/lib /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c -fdiagnostics-plain-output -fplugin=./analyzer_cpython_plugin.so -fanalyzer -I/usr/include/python3.10 -I/usr/include/python3.10 -S -o cpython-plugin-test-2.s (timeout = 600)
    spawn -ignore SIGHUP /some/sysroot/bin/aarch64-unknown-linux-gnu-gcc --sysroot=/some/sysroot/libc -Wl,-dynamic-linker=/some/sysroot/libc/lib/ld-linux-aarch64.so.1 -Wl,-rpath=/some/sysroot/libc/lib /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c -fdiagnostics-plain-output -fplugin=./analyzer_cpython_plugin.so -fanalyzer -I/usr/include/python3.10 -I/usr/include/python3.10 -S -o cpython-plugin-test-2.s
    In file included from /usr/include/python3.10/Python.h:8,
                     from /some/src/gcc.git/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c:8:
    /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-gnu/python3.10/pyconfig.h: No such file or directory
    compilation terminated.
    compiler exited with status 1

This problem causes these testsuite failures:

    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 17)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 18)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 21)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 31)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 32)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 35)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 45)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 55)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 63)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 66)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 68)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for warnings, line 69)
    FAIL: gcc.dg/plugin/cpython-plugin-test-2.c -fplugin=./analyzer_cpython_plugin.so (test for excess errors)
    Excess errors:
    /usr/include/python3.10/pyconfig.h:9:12: fatal error: aarch64-linux-gnu/python3.10/pyconfig.h: No such file or directory
    compilation terminated.

So try to compile a test file so that the testcase can be marked as unsupported instead.

gcc/testsuite/ChangeLog:

    * lib/target-supports.exp (dg-require-python-h): Test whether Python.h can really be used.
2023-08-18i386: Use PUNPCKL?? to implement vector extend and zero_extend for TARGET_SSE2.Uros Bizjak8-8/+234
Implement vector extend and zero_extend functionality for TARGET_SSE2 using PUNPCKL?? family of instructions. The code for e.g. zero-extend from V2SI to V2DImode improves from:

    movd %xmm0, %edx
    pshufd $85, %xmm0, %xmm0
    movd %xmm0, %eax
    movq %rdx, (%rdi)
    movq %rax, 8(%rdi)

to:

    pxor %xmm1, %xmm1
    punpckldq %xmm1, %xmm0
    movaps %xmm0, (%rdi)

And the code for sign-extend from V2SI to V2DImode from:

    movd %xmm0, %edx
    pshufd $85, %xmm0, %xmm0
    movd %xmm0, %eax
    movslq %edx, %rdx
    cltq
    movq %rdx, (%rdi)
    movq %rax, 8(%rdi)

to:

    pxor %xmm1, %xmm1
    pcmpgtd %xmm0, %xmm1
    punpckldq %xmm1, %xmm0
    movaps %xmm0, (%rdi)

    PR target/111023

gcc/ChangeLog:

    * config/i386/i386-expand.cc (ix86_split_mmx_punpck): Also handle V2QImode.
    (ix86_expand_sse_extend): New function.
    * config/i386/i386-protos.h (ix86_expand_sse_extend): New prototype.
    * config/i386/mmx.md (<any_extend:insn>v4qiv4hi2): Enable for TARGET_SSE2. Expand through ix86_expand_sse_extend for !TARGET_SSE4_1.
    (<any_extend:insn>v2hiv2si2): Ditto.
    (<any_extend:insn>v2qiv2hi2): Ditto.
    * config/i386/sse.md (<any_extend:insn>v8qiv8hi2): Ditto.
    (<any_extend:insn>v4hiv4si2): Ditto.
    (<any_extend:insn>v2siv2di2): Ditto.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/pr111023-2.c: New test.
    * gcc.target/i386/pr111023-4b.c: New test.
    * gcc.target/i386/pr111023-8b.c: New test.
    * gcc.target/i386/pr111023.c: New test.
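The idea behind the new sequences can be sketched with a portable scalar model. This is not the code GCC emits or the expander itself; the function names are hypothetical and each array stands in for the two low 32-bit lanes of an XMM register. The final conversion to int64_t assumes the usual two's complement wraparound, which GCC documents for out-of-range conversions.

```c
#include <stdint.h>

/* Scalar model, for illustration only: lane i of the result holds lo[i]
   in bits 0..31 and hi[i] in bits 32..63, which is what punpckldq
   produces from the two low 32-bit lanes of its operands.  */
static void interleave_low32(const uint32_t lo[2], const uint32_t hi[2],
                             uint64_t out[2])
{
    for (int i = 0; i < 2; i++)
        out[i] = (uint64_t)lo[i] | ((uint64_t)hi[i] << 32);
}

/* Zero-extend: interleave with an all-zero register (the pxor).  */
void zero_extend_v2si(const uint32_t in[2], uint64_t out[2])
{
    const uint32_t zero[2] = { 0, 0 };
    interleave_low32(in, zero, out);
}

/* Sign-extend: interleave with a mask that is all-ones in every lane
   where 0 > x (the pcmpgtd against the zeroed register).  */
void sign_extend_v2si(const int32_t in[2], int64_t out[2])
{
    uint32_t bits[2], mask[2];
    uint64_t wide[2];
    for (int i = 0; i < 2; i++) {
        bits[i] = (uint32_t)in[i];
        mask[i] = (0 > in[i]) ? UINT32_MAX : 0;
    }
    interleave_low32(bits, mask, wide);
    for (int i = 0; i < 2; i++)
        out[i] = (int64_t)wide[i];
}
```

The upper half of each widened lane is either zero or a copy of the sign, so no per-element scalar moves are needed.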
2023-08-18[irange] Return FALSE if updated bitmask is unchanged [PR110753]Aldy Hernandez2-0/+33
The mask/value pair we track in the irange is a bit fickle in that it can sometimes contradict the bitmask inherent in the range. This can happen when a series of calculations yield a combination such as:

    [3, 1000] MASK 0xfffffffe VALUE 0x0

The mask/value above implies that the lowest bit is a known 0, which would exclude the 3 in the range. At one time we tried keeping mask and ranges 100% consistent, but the performance penalty was too high (5% in VRP). Also, it's unclear whether the intersection of two incompatible known bits should make the whole range undefined, or just the contradicting bits. This is all documented in irange::get_bitmask(). We could revisit both of these assumptions in the future.

In this testcase IPA ends up with a range where the lower 2 bits are expected to be 0, but the range is [1,1].

    [irange] long int [1, 1] MASK 0xfffffffffffffffc VALUE 0x0

This causes irange::union_bitmask() to think an update occurred, when no semantic change happened, thus triggering an assert in IPA-cp. We could get rid of the assert, but it's cleaner to make irange::{union,intersect}_bitmask always tell the truth. Besides, the ranger's cache also depends on union being truthful.

    PR ipa/110753

gcc/ChangeLog:

    * value-range.cc (irange::union_bitmask): Return FALSE if updated bitmask is semantically equivalent to the original mask.
    (irange::intersect_bitmask): Same.
    (irange::get_bitmask): Add comment.

gcc/testsuite/ChangeLog:

    * gcc.dg/tree-ssa/pr110753.c: New test.
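The contradiction described above can be checked with a small standalone sketch. It is not GCC's irange code; the function names are hypothetical, and it assumes the convention the message implies: a 0 bit in MASK marks a known bit whose value is the matching bit of VALUE, and a 1 bit in MASK marks an unknown bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed convention (see lead-in): bits that are clear in MASK are known
   and must equal the corresponding bits of VALUE.  */
bool matches_known_bits(uint64_t x, uint64_t mask, uint64_t value)
{
    uint64_t known = ~mask;
    return (x & known) == (value & known);
}

/* Returns true if at least one member of [lo, hi] agrees with the known
   bits.  Requires lo <= hi.  A linear scan is enough for a sketch.  */
bool range_has_compatible_value(uint64_t lo, uint64_t hi,
                                uint64_t mask, uint64_t value)
{
    for (uint64_t x = lo; ; x++) {
        if (matches_known_bits(x, mask, value))
            return true;
        if (x == hi)            /* stop before x can wrap around */
            return false;
    }
}
```

Under this model, 3 is excluded by MASK 0xfffffffe VALUE 0x0 while other members of [3, 1000] remain, and the range [1, 1] with MASK 0xfffffffffffffffc VALUE 0x0 has no compatible member at all.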
2023-08-18tree-optimization/111019 - invariant motion and aliasingRichard Biener2-2/+77
The following fixes a bad choice in representing things to the alias oracle by LIM which while correct in pieces is inconsistent with itself. When canonicalizing a ref to a bare deref instead of leaving the base object and the extracted offset the same and just substituting an alternate ref the following replaces the base and the offset as well, avoiding the confusion that otherwise will arise in aliasing_matching_component_refs_p. PR tree-optimization/111019 * tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing also scrap base and offset in case the ref is indirect. * g++.dg/torture/pr111019.C: New testcase.
2023-08-18bpf: bump maximum frame size limit to 32767 bytesJose E. Marchesi3-1/+35
This commit bumps the maximum stack frame size allowed for BPF functions to the maximum possible value. Tested in x86_64-linux-gnu host and target bpf-unknown-none. gcc/ChangeLog * config/bpf/bpf.opt (mframe-limit): Set default to 32767. gcc/testsuite/ChangeLog * gcc.target/bpf/frame-limit-1.c: New test. * gcc.target/bpf/frame-limit-2.c: Likewise.
2023-08-18Makefile.in: Make TM_P_H depend on $(TREE_H) [PR111021]Kewen Lin1-1/+2
As PR111021 shows, the ${port}-protos.h headers below include tree.h for code_helper and tree_code:

    arm/arm-protos.h:#include "tree.h"
    cris/cris-protos.h:#include "tree.h" (H-P removed this in r14-3218)
    microblaze/microblaze-protos.h:#include "tree.h"
    rl78/rl78-protos.h:#include "tree.h"
    stormy16/stormy16-protos.h:#include "tree.h"

When compiling build/gencondmd.cc, the include hierarchy makes it depend on tm_p.h -> ${port}-protos.h -> tree.h, which further includes (depends on) some files that are generated during the build, such as all-tree.def, tree-check.h and so on. The previous commit r14-3215 should already force build/gencondmd.cc to depend on ${TREE_H}, so the reported build failure should be gone. But for long-term maintenance, especially if one day some build/xxx.cc requires tm_p.h but not recog.h, the ${TREE_H} dependence could be missed and a build failure would show up. So this patch makes TM_P_H depend on $(TREE_H), so that any new build/xxx.cc depending on tm_p.h will also take ${TREE_H} into account.

It's tested with cross-builds for the affected ports with these steps:

    1) dropped the fix r14-3215;
    2) reproduced the build failure with serial build;
    3) applied this patch, serial built and verified all passed;
    4) added back r14-3215, serial built and verified all passed;

    PR bootstrap/111021

gcc/ChangeLog:

    * Makefile.in (TM_P_H): Add $(TREE_H) as dependence.
2023-08-18vect: Factor out the handling on scatter store having gs_info.declKewen Lin1-199/+212
Similar to the existing function vect_build_gather_load_calls, this patch is to factor out the handling on scatter store having gs_info.decl to vect_build_scatter_store_calls which is a new function. It also does some minor refactoring like moving some variables' declarations close to their uses and restrict the scope for some of them etc. It's a pre-patch for upcoming vectorizable_store re-structuring for costing. gcc/ChangeLog: * tree-vect-stmts.cc (vect_build_scatter_store_calls): New, factor out from ... (vectorizable_store): ... here.
2023-08-18tree-optimization/111048 - avoid flawed logic in fold_vec_permRichard Biener2-6/+30
The following avoids running into somehow flawed logic in fold_vec_perm for non-VLA vectors. PR tree-optimization/111048 * fold-const.cc (fold_vec_perm_cst): Check for non-VLA vectors first. * gcc.dg/torture/pr111048.c: New testcase.
2023-08-18i386: Add AVX2 pragma wrapper for AVX512DQVL intrinsHaochen Jiang2-0/+22
PR target/111051 gcc/ChangeLog: * config/i386/avx512vldqintrin.h: Push AVX2 when AVX2 is disabled. gcc/testsuite/ChangeLog: PR target/111051 * gcc.target/i386/pr111051-1.c: New test.
2023-08-18vect: Move VMAT_GATHER_SCATTER handlings from final loop nestKewen Lin1-142/+216
Following Richi's suggestion [1], this patch is to move the handlings on VMAT_GATHER_SCATTER in the final loop nest of function vectorizable_load to its own loop. Basically it duplicates the final loop nest, clean up some useless set up code for the case of VMAT_GATHER_SCATTER, remove some unreachable code. Also remove the corresponding handlings in the final loop nest. [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623329.html gcc/ChangeLog: * tree-vect-stmts.cc (vectorizable_load): Move the handlings on VMAT_GATHER_SCATTER in the final loop nest to its own loop, and update the final nest accordingly.
2023-08-18RISC-V: Fix -march error of zhinxmin testcasesLehua Ding2-3/+4
This little patch fixes the -march error of a zhinxmin testcase I added earlier and of an old zhinxmin testcase, since these testcases are for the zhinxmin extension and not the zfhmin extension. gcc/testsuite/ChangeLog: * gcc.target/riscv/_Float16-zhinxmin-3.c: Adjust. * gcc.target/riscv/_Float16-zhinxmin-4.c: Ditto.
2023-08-17Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl standard patternsAndrew Pinski1-0/+62
When I added `cond_one_cmpl` (and the corresponding IFN) I noticed that the cond_neg standard named pattern was not documented; this adds the documentation for all 4 named patterns now. OK? Tested by building the manual. gcc/ChangeLog: * doc/md.texi (Standard patterns): Document cond_neg, cond_one_cmpl, cond_len_neg and cond_len_one_cmpl.
2023-08-18RISC-V: Add the missed half floating-point mode patterns of local_pic_load/store when only use zfhmin or zhinxminLehua Ding4-17/+44
Hi, There is a new failing RISC-V testcase (testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c) on the current trunk branch when medany is used as the default cmodel. The reason is that the load of the half floating-point immediate changes from RTL 1 to RTL 2 when the cmodel changes from medlow to medany. This change lets insn 7 be combined with the @pred_broadcast pattern (insn 8) in the combine pass. However, insn 6 and insn 7 are combined for SF and DF mode, but not for HF mode, and this failed combination leads to insn 7 and insn 8 being combined. The combination fails because the local_pic_loadhf pattern doesn't exist when only zfhmin (implied by zvfh) is enabled. Therefore, when only zfhmin but not zfh is enabled, the define_insn of *local_pic_load<ANYF:mode> must also be able to produce the *local_pic_loadhf pattern, since the zfhmin extension also includes half floating-point load/store instructions. So, I added an ANYLSF iterator and applied it to the local_pic_load/store define_insns. I have checked other ANYF usage scenarios and feel that this is the only place that needs to be corrected. I may have missed something; please correct me if so. Thanks.
RTL 1: (insn 6 3 7 2 (set (reg:DI 137) (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {*movdi_64bit} (nil)) (insn 7 6 8 2 (set (reg:HF 136) (mem/u/c:HF (lo_sum:DI (reg:DI 137) (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0 S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {*movhf_hardfloat} (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4]) (nil))) RTL 2: (insn 6 3 7 2 (set (reg/f:DI 137) (symbol_ref/u:DI ("*.LC0") [flags 0x82])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 179 {*movdi_64bit} (nil)) (insn 7 6 8 2 (set (reg:HF 136) (mem/u/c:HF (reg/f:DI 137) [0 S2 A16])) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":7:1 discrim 3 126 {*movhf_hardfloat} (expr_list:REG_EQUAL (const_double:HF 8.8828125e+0 [0x0.8e2p+4]) (nil))) (insn 8 7 9 2 (set (reg:V2HF 135) (if_then_else:V2HF (unspec:V2BI [ (const_vector:V2BI [ (const_int 1 [0x1]) repeated x2 ]) (const_int 2 [0x2]) repeated x3 (const_int 0 [0]) (reg:SI 66 vl) (reg:SI 67 vtype) ] UNSPEC_VPREDICATE) (vec_duplicate:V2HF (reg:HF 136)) (unspec:V2HF [ (reg:SI 0 zero) ] UNSPEC_VUNDEF))) "/work/home/lding/open-source/riscv-gnu-toolchain-push/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/const-4.c":6:1 discrim 3 1389 {*pred_broadcastv2hf} (nil)) Best, Lehua gcc/ChangeLog: * config/riscv/iterators.md (TARGET_HARD_FLOAT || TARGET_ZFINX): New. * config/riscv/pic.md (*local_pic_load<ANYF:mode>): Change ANYF. (*local_pic_load<ANYLSF:mode>): To ANYLSF. (*local_pic_load_32d<ANYF:mode>): Ditto. (*local_pic_load_32d<ANYLSF:mode>): Ditto. (*local_pic_store<ANYF:mode>): Ditto. (*local_pic_store<ANYLSF:mode>): Ditto. 
(*local_pic_store_32d<ANYF:mode>): Ditto. (*local_pic_store_32d<ANYLSF:mode>): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/_Float16-zfhmin-4.c: New test. * gcc.target/riscv/_Float16-zhinxmin-4.c: New test.
2023-08-18RISC-V: Revert the convert from vmv.s.x to vmv.v.iLehua Ding4-19/+70
Hi, This patch reverts the conversion from vmv.s.x to vmv.v.i and adds a new pattern to optimize the special case where the scalar operand is zero. Currently, a broadcast pattern whose scalar operand is an immediate is converted from vmv.s.x to vmv.v.i, and the mask operand is converted from 00..01 to 11..11. There are advantages and disadvantages before and after the conversion; after discussing with Juzhe offline, we chose not to do this transform.

Before:

    Advantages: The vsetvli info required by vmv.s.x has better compatibility since vmv.s.x only requires SEW and VLEN to be zero or one. That means there are more opportunities to combine with other vsetvl infos in the vsetvl pass.
    Disadvantages: For a non-zero scalar immediate, one more `li rd, imm` instruction is needed.

After:

    Advantages: No `li rd, imm` instruction is needed since vmv.v.i supports an immediate operand.
    Disadvantages: The converse of the advantages listed under "Before": worse compatibility means more vsetvl instructions are needed.

Consider the C code below and the asm after autovec. There is an extra insn (vsetivli zero, 1, e32, m1, ta, ma) after converting vmv.s.x to vmv.v.i.

```
int foo1(int* restrict a, int* restrict b, int *restrict c, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
      sum += a[i] * b[i];

    return sum;
}
```

asm (Before):

```
foo1:
        ble a3,zero,.L7
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L6:
        vsetvli a5,a3,e32,m1,tu,ma
        slli a4,a5,2
        sub a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add a0,a0,a4
        add a1,a1,a4
        vmacc.vv v1,v3,v2
        bne a3,zero,.L6
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.s.x v2,zero
        vredsum.vs v1,v1,v2
        vmv.x.s a0,v1
        ret
.L7:
        li a0,0
        ret
```

asm (After):

```
foo1:
        ble a3,zero,.L4
        vsetvli a2,zero,e32,m1,ta,ma
        vmv.v.i v1,0
.L3:
        vsetvli a5,a3,e32,m1,tu,ma
        slli a4,a5,2
        sub a3,a3,a5
        vle32.v v2,0(a0)
        vle32.v v3,0(a1)
        add a0,a0,a4
        add a1,a1,a4
        vmacc.vv v1,v3,v2
        bne a3,zero,.L3
        vsetivli zero,1,e32,m1,ta,ma
        vmv.v.i v2,0
        vsetvli a2,zero,e32,m1,ta,ma
        vredsum.vs v1,v1,v2
        vmv.x.s a0,v1
        ret
.L4:
        li a0,0
        ret
```

Best, Lehua

Co-Authored-By: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

    * config/riscv/predicates.md (vector_const_0_operand): New.
    * config/riscv/vector.md (*pred_broadcast<mode>_zero): Ditto.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/base/scalar_move-5.c: Update.
    * gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
2023-08-18RISC-V: Forbidden fuse vlmax vsetvl to DEMAND_NONZERO_AVL vsetvlLehua Ding2-0/+23
Hi, This little patch fixes the failing testcase (gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c) seen after applying the patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627121.html. The specific reason is that the vsetvl pass has a bug, and this patch forbids the fusion in this case. This patch needs to be committed before that patch to work. Best, Lehua gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): Forbidden. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: Address failure due to uninitialized vtype register.
2023-08-18Daily bump.GCC Administrator5-1/+542
2023-08-17Fix range-ops operator_addr.Andrew MacLeod2-1/+49
Lack of symbolic information prevents op1_range from being able to draw the same conclusions as fold_range can. PR tree-optimization/111009 gcc/ * range-op.cc (operator_addr_expr::op1_range): Be more restrictive. gcc/testsuite/ * gcc.dg/pr111009.c: New.
2023-08-17RISCV: Add rotate immediate regression testPatrick O'Neill2-0/+40
This adds new regression tests to ensure half-register rotations are correctly optimized into rori instructions. gcc/testsuite/ChangeLog: * gcc.target/riscv/zbb-rol-ror-08.c: New test. * gcc.target/riscv/zbb-rol-ror-09.c: New test. Co-authored-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
2023-08-17[LRA]: When assigning stack slots to pseudos previously assigned to fp consider other spilled pseudosVladimir N. Makarov1-1/+2
The previous LRA patch can assign the slot of conflicting pseudos to pseudos spilled after prohibiting fp->sp elimination. This patch fixes this problem. gcc/ChangeLog: * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Moving slots_num initialization from here ... (lra_spill): ... to here before the 1st call of assign_stack_slot_num_and_sort_pseudos. Add the 2nd call after fp->sp elimination.
2023-08-17Add warning options -W[no-]compare-distinct-pointer-typesJose E. Marchesi6-3/+111
GCC emits pedwarns unconditionally when comparing pointers of different types, for example:

    int xdp_context (struct xdp_md *xdp)
    {
      void *data = (void *)(long)xdp->data;
      __u32 *metadata = (void *)(long)xdp->data_meta;
      __u32 ret;

      if (metadata + 1 > data)
        return 0;
      return 1;
    }

    /home/jemarch/foo.c: In function ‘xdp_context’:
    /home/jemarch/foo.c:15:20: warning: comparison of distinct pointer types lacks a cast
       15 |   if (metadata + 1 > data)
          |                    ^

LLVM supports an option -W[no-]compare-distinct-pointer-types that can be used in order to enable or disable the emission of such warnings. It is enabled by default.

This patch adds the same options to GCC.

Documentation and testsuite updates are included. Regtested in x86_64-linux-gnu. No regressions observed.

gcc/ChangeLog:

    PR c/106537
    * doc/invoke.texi (Option Summary): Mention -Wcompare-distinct-pointer-types under `Warning Options'.
    (Warning Options): Document -Wcompare-distinct-pointer-types.

gcc/c-family/ChangeLog:

    PR c/106537
    * c.opt (Wcompare-distinct-pointer-types): New option.

gcc/c/ChangeLog:

    PR c/106537
    * c-typeck.cc (build_binary_op): Warn on comparing distinct pointer types only when -Wcompare-distinct-pointer-types.

gcc/testsuite/ChangeLog:

    PR c/106537
    * gcc.c-torture/compile/pr106537-1.c: New test.
    * gcc.c-torture/compile/pr106537-2.c: Likewise.
    * gcc.c-torture/compile/pr106537-3.c: Likewise.
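For code that prefers to avoid the diagnostic rather than disable it, the usual alternative is an explicit cast so that both operands have the same pointer type. The helper below is a hypothetical sketch modelled on the snippet in the commit message, with standard `uint32_t` standing in for the kernel's `__u32`; it assumes both pointers refer to the same buffer, as relational comparison of unrelated pointers is not defined in C.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the commit.  Casting the uint32_t
   pointer to 'const void *' gives both operands the same type, so the
   comparison is no longer between distinct pointer types.  */
int metadata_precedes_data(const uint32_t *metadata, const void *data)
{
    if ((const void *)(metadata + 1) > data)
        return 0;
    return 1;
}
```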
2023-08-17Fix code_helper unused argument warning for fr30Jan-Benedict Glaw1-1/+1
fr30 is the only target defining GO_IF_LEGITIMATE_ADDRESS right now, in which case the `code_helper ch` argument to memory_address_addr_space_p() is unused and emits a new warning. gcc/ChangeLog: * recog.cc (memory_address_addr_space_p): Mark possibly unused argument as unused.
2023-08-17[PATCH] RISC-V: Deduplicate #error messages in testsuiteTsukasa OI16-104/+104
"#error Feature macro not defined" is required to test the existence of an extension through the preprocessor. However, multiple occurrences of the exact same error message will confuse the developer once an error is encountered. This commit replaces such error messages with "#error Feature macro for `EXT' not defined" to make clear which macro is missing.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/zvkn.c: Deduplicate #error messages.
    * gcc.target/riscv/zvkn-1.c: Ditto.
    * gcc.target/riscv/zvknc.c: Ditto.
    * gcc.target/riscv/zvknc-1.c: Ditto.
    * gcc.target/riscv/zvknc-2.c: Ditto.
    * gcc.target/riscv/zvkng.c: Ditto.
    * gcc.target/riscv/zvkng-1.c: Ditto.
    * gcc.target/riscv/zvkng-2.c: Ditto.
    * gcc.target/riscv/zvks.c: Ditto.
    * gcc.target/riscv/zvks-1.c: Ditto.
    * gcc.target/riscv/zvksc.c: Ditto.
    * gcc.target/riscv/zvksc-1.c: Ditto.
    * gcc.target/riscv/zvksc-2.c: Ditto.
    * gcc.target/riscv/zvksg.c: Ditto.
    * gcc.target/riscv/zvksg-1.c: Ditto.
    * gcc.target/riscv/zvksg-2.c: Ditto.
2023-08-17tree-optimization/111039 - abnormals and bit test mergingRichard Biener2-0/+22
The following guards the bit test merging code in if-combine against the appearance of SSA names used in abnormal PHIs. PR tree-optimization/111039 * tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for SSA_NAME_OCCURS_IN_ABNORMAL_PHI. * gcc.dg/pr111039.c: New testcase.
2023-08-17doc: Fixes to RTL-SSA sample codeAlex Coplan1-12/+12
This patch fixes up the code examples in the RTL-SSA documentation (the sections on making insn changes) to reflect the current API. The main issues are as follows: - rtl_ssa::recog takes an obstack_watermark & as the first parameter. Presumably this is intended to be the change attempt, so I've updated the examples to pass this through. - The variants of recog and restrict_movement that take an ignore predicate have been renamed with an _ignoring suffix, so I've updated callers to use those names. - A couple of minor "obvious" fixes to add a missing address-of operator and correct a variable name. gcc/ChangeLog: * doc/rtl.texi: Fix up sample code for RTL-SSA insn changes.
2023-08-17RISC-V: Fix XPASS slp testcasesLehua Ding10-25/+36
This patch fixes XPASS slp testcases on trunk by making the conditions for xfail stricter.

gcc/testsuite/ChangeLog:

    * gcc.target/riscv/rvv/autovec/partial/slp-1.c: Fix.
    * gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-17.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-18.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-19.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-2.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-4.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.
    * gcc.target/riscv/rvv/autovec/partial/slp-6.c: Ditto.
2023-08-17bpf: support `naked' function attributes in BPF targetsJose E. Marchesi3-0/+48
The kernel selftests and other BPF programs make extensive use of the `naked' function attribute with bodies written using basic inline assembly. This patch adds support for the attribute to bpf-unknown-none, makes it inhibit warnings due to the lack of an explicit `return' statement, and updates documentation and testsuite accordingly.

Tested in x86_64-linux-gnu host and bpf-unknown-none target.

gcc/ChangeLog

    PR target/111046
    * config/bpf/bpf.cc (bpf_attribute_table): Add entry for the `naked' function attribute.
    (bpf_warn_func_return): New function.
    (TARGET_WARN_FUNC_RETURN): Define.
    (bpf_expand_prologue): Add preventive comment.
    (bpf_expand_epilogue): Likewise.
    * doc/extend.texi (BPF Function Attributes): Document the `naked' function attribute.

gcc/testsuite/ChangeLog

    * gcc.target/bpf/naked-1.c: New test.
2023-08-17Handle TYPE_OVERFLOW_UNDEFINED vectorized BB reductionsRichard Biener2-14/+109
The following changes the gate to perform vectorization of BB reductions to use needs_fold_left_reduction_p which in turn requires handling TYPE_OVERFLOW_UNDEFINED types in the epilogue code generation by promoting any operations generated there to use unsigned arithmetic. The following does this, there's currently only v16qi where x86 supports a .REDUC_PLUS reduction for integral modes so I had to add a x86 specific testcase using GIMPLE IL. * tree-vect-slp.cc (vect_slp_check_for_roots): Use !needs_fold_left_reduction_p to decide whether we can handle the reduction with association. (vectorize_slp_instance_root_stmt): For TYPE_OVERFLOW_UNDEFINED reductions perform all arithmetic in an unsigned type. * gcc.target/i386/vect-reduc-2.c: New testcase.
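Why the promotion to unsigned matters can be seen in a scalar sketch. This is not the vectorizer's code; the function name and the choice of two partial accumulators are illustrative assumptions. Reassociating a signed sum can create intermediate overflow that the original evaluation order never hits, so the partial sums are kept in unsigned arithmetic, where wraparound is well defined.

```c
#include <stddef.h>

/* Illustration only: sum signed values with two partial accumulators, as a
   reassociated (vectorized) reduction would.  The partial sums use
   unsigned arithmetic so that intermediate wraparound is well defined.  */
int sum_reassociated(const int *v, size_t n)
{
    unsigned acc[2] = { 0u, 0u };
    for (size_t i = 0; i < n; i++)
        acc[i % 2] += (unsigned)v[i];
    /* Converting an out-of-range value back to int is implementation-defined;
       GCC documents it as reduction modulo 2^N.  */
    return (int)(acc[0] + acc[1]);
}
```

For the input {INT_MAX, -INT_MAX, 1, 1} with 32-bit int, a left-to-right signed sum never overflows, but the first partial accumulator computes INT_MAX + 1, which would be undefined in signed arithmetic.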
2023-08-17testsuite: Remove unused dg-line in ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83benjamin priour1-1/+1
Test case g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C introduced by patch ce8cdf5bcf96a2db6d7b9f656fc9ba58d7942a83 emitted a warning for an unused dg-line variable. This fixes up the blunder.

Signed-off-by: benjamin priour <vultkayn@gcc.gnu.org>

gcc/testsuite/ChangeLog:

	* g++.dg/analyzer/fanalyzer-show-events-in-system-headers.C: Remove dg-line var declare_a.
2023-08-17build: Allow for Xcode 15 ld -v outputRainer Orth2-2/+4
Since Xcode 15 beta 6, ld -v output differs from previous versions:

* macOS 13/Xcode 14: @(#)PROGRAM:ld PROJECT:ld64-857.1
* macOS 14/Xcode 15: @(#)PROGRAM:ld PROJECT:dyld-1015.1

configure cannot handle the new form, so LD64_VERSION isn't set. This patch fixes this. The autoconf manual states that sed doesn't portably support alternation, so I'm using two separate expressions to extract the version number.

Tested on x86_64-apple-darwin23.0.0.

2023-08-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc:
	* configure.ac (gcc_cv_ld64_version): Allow for dyld in ld -v output.
	* configure: Regenerate.
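The idea of trying each form separately, rather than relying on alternation, can be sketched in C. This is only an illustration of the matching logic applied to the two sample strings above; it is not the sed code in configure.ac, and the function name `extract_ld64_version` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copy the version number that follows "PROJECT:ld64-" or
   "PROJECT:dyld-" in LD_V_OUTPUT into BUF.  Returns 1 on success and
   0 if neither form matches or the version does not fit in BUF.
   The two prefixes are tried one after the other, in the spirit of
   using two separate expressions instead of one alternation.  */
int extract_ld64_version(const char *ld_v_output, char *buf, size_t buflen)
{
    static const char *const prefixes[] = { "PROJECT:ld64-", "PROJECT:dyld-" };

    assert(ld_v_output != NULL && buf != NULL);
    for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        const char *p = strstr(ld_v_output, prefixes[i]);
        if (p == NULL)
            continue;
        p += strlen(prefixes[i]);
        /* The version is the run of digits and dots after the prefix.  */
        size_t len = strspn(p, "0123456789.");
        if (len == 0 || len >= buflen)
            return 0;
        memcpy(buf, p, len);
        buf[len] = '\0';
        return 1;
    }
    return 0;
}
```

Given the Xcode 14 sample the function stores "857.1", and given the Xcode 15 sample it stores "1015.1".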
2023-08-17RISC-V: Support RVV VFWREDOSUM.VS rounding mode intrinsic APIPan Li4-1/+44
This patch supports the rounding mode API for VFWREDOSUM.VS, as in the samples below.

* __riscv_vfwredosum_vs_f32m1_f64m1_rm
* __riscv_vfwredosum_vs_f32m1_f64m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (widen_freducop): Add frm_opt_type template arg.
	(vfwredosum_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfwredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-wredosum.c: New test.
2023-08-17RISC-V: Support RVV VFREDOSUM.VS rounding mode intrinsic APIPan Li4-0/+37
This patch supports the rounding mode API for VFREDOSUM.VS, as in the samples below.

* __riscv_vfredosum_vs_f32m1_f32m1_rm
* __riscv_vfredosum_vs_f32m1_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (vfredosum_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfredosum_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-redosum.c: New test.
2023-08-17RISC-V: Support RVV VFREDUSUM.VS rounding mode intrinsic APIPan Li6-1/+84
This patch supports the rounding mode API for VFREDUSUM.VS, as in the samples below.

* __riscv_vfredusum_vs_f32m1_f32m1_rm
* __riscv_vfredusum_vs_f32m1_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (class freducop): Add frm_op_type template arg.
	(vfredusum_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfredusum_frm): New intrinsic function def.
	* config/riscv/riscv-vector-builtins-shapes.cc (struct reduc_alu_frm_def): New class for frm shape.
	(SHAPE): New declaration.
	* config/riscv/riscv-vector-builtins-shapes.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-redusum.c: New test.
2023-08-17RISC-V: Support RVV VFNCVT.F.{X|XU|F}.W rounding mode intrinsic APIPan Li4-1/+82
This patch supports the rounding mode API for VFNCVT.F.{X|XU|F}.W, as in the samples below.

* __riscv_vfncvt_f_x_w_f32m1_rm
* __riscv_vfncvt_f_x_w_f32m1_rm_m
* __riscv_vfncvt_f_xu_w_f32m1_rm
* __riscv_vfncvt_f_xu_w_f32m1_rm_m
* __riscv_vfncvt_f_f_w_f32m1_rm
* __riscv_vfncvt_f_f_w_f32m1_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (class vfncvt_f): Add frm_op_type template arg.
	(vfncvt_f_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfncvt_f_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-ncvt-f.c: New test.
2023-08-17RISC-V: Support RVV VFNCVT.XU.F.W rounding mode intrinsic APIPan Li4-0/+33
This patch supports the rounding mode API for VFNCVT.XU.F.W, as in the samples below.

* __riscv_vfncvt_xu_f_w_u16mf2_rm
* __riscv_vfncvt_xu_f_w_u16mf2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (vfncvt_xu_frm_obj): New declaration.
	(BASE): Ditto.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfncvt_xu_frm): New intrinsic function def.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-ncvt-xu.c: New test.
2023-08-17RISC-V: Support RVV VFNCVT.X.F.W rounding mode intrinsic APIPan Li6-1/+80
This patch supports the rounding mode API for VFNCVT.X.F.W, as in the samples below.

* __riscv_vfncvt_x_f_w_i16mf2_rm
* __riscv_vfncvt_x_f_w_i16mf2_rm_m

Signed-off-by: Pan Li <pan2.li@intel.com>

gcc/ChangeLog:

	* config/riscv/riscv-vector-builtins-bases.cc (class vfncvt_x): Add frm_op_type template arg.
	(BASE): New declaration.
	* config/riscv/riscv-vector-builtins-bases.h: Ditto.
	* config/riscv/riscv-vector-builtins-functions.def (vfncvt_x_frm): New intrinsic function def.
	* config/riscv/riscv-vector-builtins-shapes.cc (struct narrow_alu_frm_def): New shape function for frm.
	(SHAPE): New declaration.
	* config/riscv/riscv-vector-builtins-shapes.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/float-point-ncvt-x.c: New test.
2023-08-17[Patch 6/6] Support AVX10.1 for AVX512DQ+AVX512VL intrinsHaochen Jiang10-0/+227
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-vextractf64x2-1.c: New test.
	* gcc.target/i386/avx10_1-vextracti64x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vfpclasspd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vfpclassps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vinsertf64x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vinserti64x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vrangepd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vrangeps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vreducepd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vreduceps-1.c: Ditto.
2023-08-17[Patch 5/6] Support AVX10.1 for AVX512DQ+AVX512VL intrinsHaochen Jiang4-65/+76
gcc/ChangeLog:

	* config/i386/avx512vldqintrin.h: Remove target attribute.
	* config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_AVX10_1.
	* config/i386/sse.md (VF_AVX512VLDQ_AVX10_1): New.
	(VFH_AVX512VLDQ_AVX10_1): Ditto.
	(VF1_AVX512VLDQ_AVX10_1): Ditto.
	(<mask_codefor>reducep<mode><mask_name><round_saeonly_name>): Change iterator to VFH_AVX512VLDQ_AVX10_1. Remove target check.
	(vec_pack<floatprefix>_float_<mode>): Change iterator to VI8_AVX512VLDQ_AVX10_1. Remove target check.
	(vec_unpack_<fixprefix>fix_trunc_lo_<mode>): Change iterator to VF1_AVX512VLDQ_AVX10_1. Remove target check.
	(vec_unpack_<fixprefix>fix_trunc_hi_<mode>): Ditto.
	(VI48F_256_DQVL_AVX10_1): Rename from VI48F_256_DQ.
	(avx512vl_vextractf128<mode>): Change iterator to VI48F_256_DQVL_AVX10_1. Remove target check.
	(vec_extract_hi_<mode>_mask): Add TARGET_AVX10_1.
	(vec_extract_hi_<mode>): Ditto.
	(avx512vl_vinsert<mode>): Ditto.
	(vec_set_lo_<mode><mask_name>): Ditto.
	(vec_set_hi_<mode><mask_name>): Ditto.
	(avx512dq_rangep<mode><mask_name><round_saeonly_name>): Change iterator to VF_AVX512VLDQ_AVX10_1. Remove target check.
	(avx512dq_fpclass<mode><mask_scalar_merge_name>): Change iterator to VFH_AVX512VLDQ_AVX10_1. Remove target check.
	* config/i386/subst.md (mask_avx512dq_condition): Add TARGET_AVX10_1.
	(mask_scalar_merge): Ditto.
2023-08-17[Patch 4/6] Support AVX10.1 for AVX512DQ+AVX512VL intrinsHaochen Jiang17-0/+430
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-abs-copysign-1.c: New test.
	* gcc.target/i386/avx10_1-vandpd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vandps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtps2qq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtps2uqq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtqq2pd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtqq2ps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtuqq2pd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtuqq2ps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vorpd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vorps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpmovd2m-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpmovm2d-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpmovm2q-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpmovq2m-1.c: Ditto.
	* gcc.target/i386/avx10_1-vxorpd-1.c: Ditto.
	* gcc.target/i386/avx10_1-vxorps-1.c: Ditto.
2023-08-17[Patch 3/6] Support AVX10.1 for AVX512DQ+AVX512VL intrinsHaochen Jiang4-89/+109
gcc/ChangeLog:

	* config/i386/avx512vldqintrin.h: Remove target attribute.
	* config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_AVX10_1.
	* config/i386/i386.cc (standard_sse_constant_opcode): Add TARGET_AVX10_1.
	* config/i386/sse.md: (VI48_AVX512VL_AVX10_1): New.
	(VI48_AVX512VLDQ_AVX10_1): Ditto.
	(VF2_AVX512VL): Remove.
	(VI8_256_512VLDQ_AVX10_1): Rename from VI8_256_512. Add TARGET_AVX10_1.
	(*<code><mode>3<mask_name>): Change isa attribute to avx10_1_or_avx512dq. Add TARGET_AVX10_1.
	(<code><mode>3): Add TARGET_AVX10_1. Change isa attr to avx10_1_or_avx512vl.
	(<mask_codefor>avx512dq_cvtps2qq<mode><mask_name><round_name>): Change iterator to VI8_256_512VLDQ_AVX10_1. Remove target check.
	(<mask_codefor>avx512dq_cvtps2qqv2di<mask_name>): Add TARGET_AVX10_1.
	(<mask_codefor>avx512dq_cvtps2uqq<mode><mask_name><round_name>): Change iterator to VI8_256_512VLDQ_AVX10_1. Remove target check.
	(<mask_codefor>avx512dq_cvtps2uqqv2di<mask_name>): Add TARGET_AVX10_1.
	(float<floatunssuffix><sseintvecmodelower><mode>2<mask_name><round_name>): Change iterator to VF2_AVX512VLDQ_AVX10_1. Remove target check.
	(float<floatunssuffix><sselongvecmodelower><mode>2<mask_name><round_name>): Change iterator to VF1_128_256VLDQ_AVX10_1. Remove target check.
	(float<floatunssuffix>v4div4sf2<mask_name>): Add TARGET_AVX10_1.
	(avx512dq_float<floatunssuffix>v2div2sf2): Ditto.
	(*avx512dq_float<floatunssuffix>v2div2sf2): Ditto.
	(float<floatunssuffix>v2div2sf2): Ditto.
	(float<floatunssuffix>v2div2sf2_mask): Ditto.
	(*float<floatunssuffix>v2div2sf2_mask): Ditto.
	(*float<floatunssuffix>v2div2sf2_mask_1): Ditto.
	(<avx512>_cvt<ssemodesuffix>2mask<mode>): Change iterator to VI48_AVX512VLDQ_AVX10_1. Remove target check.
	(<avx512>_cvtmask2<ssemodesuffix><mode>): Ditto.
	(*<avx512>_cvtmask2<ssemodesuffix><mode>): Change iterator to VI48_AVX512VL_AVX10_1. Remove target check. Change when constraint is enabled.
2023-08-17RISC-V: Fix incorrect VTYPE fusion for floating point scalar move insn[PR111037]Juzhe-Zhong3-2/+43
void foo(_Float16 y, int64_t *i64p)
{
  vint64m1_t vx =__riscv_vle64_v_i64m1 (i64p, 1);
  vx = __riscv_vadd_vv_i64m1 (vx, vx, 1);
  vfloat16m1_t vy =__riscv_vfmv_s_f_f16m1 (y, 1);
  asm volatile ("# use %0 %1" : : "vr"(vx), "vr" (vy));
}

zve64f:
foo:
	vsetivli zero,1,e16,mf4,ta,ma
	vle64.v v1,0(a0)
	vfmv.s.f v2,fa0
	vsetvli zero,zero,e64,m1,ta,ma
	vadd.vv v1,v1,v1

zve64d:
foo:
	vsetivli zero,1,e64,m1,ta,ma
	vle64.v v1,0(a0)
	vfmv.s.f v2,fa0
	vadd.vv v1,v1,v1

gcc/ChangeLog:

	PR target/111037
	* config/riscv/riscv-vsetvl.cc (float_insn_valid_sew_p): New function.
	(second_sew_less_than_first_sew_p): Fix bug.
	(first_sew_less_than_second_sew_p): Ditto.

gcc/testsuite/ChangeLog:

	PR target/111037
	* gcc.target/riscv/rvv/base/pr111037-1.c: New test.
	* gcc.target/riscv/rvv/base/pr111037-2.c: New test.
2023-08-17Support AVX10.1 for AVX512DQ+AVX512VL intrinsHaochen Jiang13-0/+318
gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx10_1-vandnpd-1.c: New test.
	* gcc.target/i386/avx10_1-vandnps-1.c: Ditto.
	* gcc.target/i386/avx10_1-vbroadcastf32x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vbroadcastf64x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vbroadcasti32x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vbroadcasti64x2-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtpd2qq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvtpd2uqq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvttpd2qq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvttpd2uqq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvttps2qq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vcvttps2uqq-1.c: Ditto.
	* gcc.target/i386/avx10_1-vpmullq-1.c: Ditto.