Age | Commit message (Collapse) | Author | Files | Lines |
|
-march was require canonical order before, however it's not easy for
most user when we have so many extension, so this patch is relax the
constraint, -march accept the ISA string in any order, it only has few
requirement:
1. Must start with rv[32|64][e|i|g].
2. Multi-letter and single letter extension must be separated by
at least one underscore(`_`).
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_single_std_ext): New parameter.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::parse): Relax the order for the input of ISA
string.
* config/riscv/riscv-subset.h
(riscv_subset_list::parse_single_std_ext): New parameter.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-33.c: New.
* gcc.target/riscv/arch-34.c: New.
|
|
Minor refactor, preparation for further change.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_base_ext): New.
(riscv_subset_list::parse): Extract part of logic into
riscv_subset_list::parse_base_ext.
* config/riscv/riscv-subset.h (riscv_subset_list::parse_base_ext):
New.
|
|
Use "does not" rather than "cannot", because it's implementation issue.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_override_options_internal): Tweak
sorry message.
|
|
UNSPEC_CLMUL is defined to define_c_enum in riscv.md, so
it shouldn't be redefined to define_int_iterator again.
gcc/ChangeLog:
* config/riscv/vector-crypto.md (UNSPEC_CLMUL): Rename to
UNSPEC_CLMUL_VC.
|
|
2024-01-18 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR tree-optimization/69807
* config/pa/pa.cc (pa_option_override): Set flag_pie on TARGET_64BIT.
gcc/testsuite/ChangeLog:
* gcc.dg/pic-2.c: Skip on hppa*64*-*-*.
|
|
There are some xtheadvector instructions that differ from RVV1.0
apart from simply adding "th." prefix. For example, RVV1.0
load/store instructions will have SEW while xtheadvector not;
RVV1.0 will have "o" for indexed-ordered store instructions while
xtheadvecotr not; xtheadvector and RVV1.0 have different
vnsrl/vnsra/vfncvt suffix (vv/vx/vi vs wv/wx/wi).
To address this issue without duplicating patterns, we use ASM
targethook to rewrite the whole string of the instructions. We
identify different instructions from the corresponding attribute.
gcc/ChangeLog:
* config/riscv/thead.cc
(th_asm_output_opcode): Rewrite some instructions.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
For th.vmadc/th.vmsbc as well as narrowing arithmetic instructions
and floating-point compare instructions, an illegal instruction
exception will be raised if the destination vector register overlaps
a source vector register group.
To handle this issue, we add an attribute "spec_restriction" to disable
some alternatives for xtheadvector.
gcc/ChangeLog:
* config/riscv/riscv.md (none,thv,rvv): New attribute.
(no,yes): Add an attribute to disable alternative
for xtheadvector or RVV1.0.
* config/riscv/vector.md:
Disable alternatives that destination register overlaps
source register group for xtheadvector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
This patch only involves the generation of xtheadvector
special load/store instructions and vext instructions.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc
(class th_loadstore_width): Define new builtin bases.
(class th_extract): Define new builtin bases.
(BASE): Define new builtin bases.
* config/riscv/riscv-vector-builtins-bases.h:
Define new builtin class.
* config/riscv/riscv-vector-builtins-shapes.cc
(struct th_loadstore_width_def): Define new builtin shapes.
(struct th_indexed_loadstore_width_def):
Define new builtin shapes.
(struct th_extract_def): Define new builtin shapes.
(SHAPE): Define new builtin shapes.
* config/riscv/riscv-vector-builtins-shapes.h:
Define new builtin shapes.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_FUNCTION):
Redefine DEF_RVV_FUNCTION for XTheadVector special intrinsics.
* config/riscv/riscv-vector-builtins.h
(enum required_ext): Add new XTheadVector member.
(struct function_group_info): Likewise.
* config/riscv/t-riscv:
Add thead-vector-builtins-functions.def
* config/riscv/thead-vector.md
(@pred_mov_width<vlmem_op_attr><mode>): Add new patterns.
(*pred_mov_width<vlmem_op_attr><mode>): Likewise.
(@pred_store_width<vlmem_op_attr><mode>): Likewise.
(@pred_strided_load_width<vlmem_op_attr><mode>): Likewise.
(@pred_strided_store_width<vlmem_op_attr><mode>): Likewise.
(@pred_indexed_load_width<vlmem_op_attr><mode>): Likewise.
(@pred_th_extract<mode>): Likewise.
(*pred_th_extract<mode>): Likewise.
* config/riscv/thead-vector-builtins-functions.def: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/vlb-vsb.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlbu-vsb.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlh-vsh.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlhu-vsh.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlw-vsw.c: New test.
* gcc.target/riscv/rvv/xtheadvector/vlwu-vsw.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
This patch is to handle the differences in instruction generation
between Vector and XTheadVector. In this version, we only support
partial xtheadvector instructions that leverage directly from current
RVV1.0 with simple adding "th." prefix. For different name xtheadvector
instructions but share same patterns as RVV1.0 instructions, we will
use ASM targethook to rewrite the whole string of the instructions in
the following patches.
For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in vector.md in order
not to generate instructions that xtheadvector does not support,
like vmv1r.
gcc/ChangeLog:
* config.gcc: Add files for XTheadVector intrinsics.
* config/riscv/autovec.md: Guard XTheadVector.
* config/riscv/predicates.md: Disable immediate vl
for XTheadVector.
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic):
Add pragma for XTheadVector.
* config/riscv/riscv-string.cc (riscv_expand_block_move):
Guard XTheadVector.
* config/riscv/riscv-v.cc (vls_mode_valid_p):
Avoid autovec.
* config/riscv/riscv-vector-builtins-bases.cc:
Do not normalize vsetvl instructions for XTheadVector.
* config/riscv/riscv-vector-builtins-shapes.cc (check_type):
New check type function.
(build_one): Adjust for XTheadVector.
* config/riscv/riscv-vector-switch.def (ENTRY):
Disable fractional mode for the XTheadVector extension.
(TUPLE_ENTRY): Likewise.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
Guard XTheadVector.
(riscv_preferred_simd_mode): Likewsie.
(riscv_autovectorize_vector_modes): Likewise.
(riscv_vector_mode_supported_any_target_p): Likewise.
(TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise.
* config/riscv/thead.cc (th_asm_output_opcode):
Rewrite vsetvl instructions.
* config/riscv/vector.md:
Include thead-vector.md and change fractional LMUL
into 1 for vbool.
* config/riscv/riscv_th_vector.h: New file.
* config/riscv/thead-vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Add XTheadVector.
* gcc.target/riscv/rvv/base/abi-1.c: Exclude XTheadVector.
* lib/target-supports.exp: Add target for XTheadVector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
This patch adds th. prefix to all XTheadVector instructions by
implementing new assembly output functions. We only check the
prefix is 'v', so that no extra attribute is needed.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_asm_output_opcode):
Add new function to add assembler insn code prefix/suffix.
(th_asm_output_opcode):
Add Thead function to add assembler insn code prefix/suffix.
* config/riscv/riscv.cc (riscv_asm_output_opcode):
Implement function to add assembler insn code prefix/suffix.
* config/riscv/riscv.h (ASM_OUTPUT_OPCODE):
Add new function to add assembler insn code prefix/suffix.
* config/riscv/thead.cc (th_asm_output_opcode):
Implement Thead function to add assembler insn code
prefix/suffix.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/xtheadvector/prefix.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
This patch is to introduce basic XTheadVector support
(march string parsing and a test for __riscv_xtheadvector)
according to https://github.com/T-head-Semi/thead-extension-spec/
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse): Add new vendor extension.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Add test marco.
* config/riscv/riscv.opt: Add new mask.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-__riscv_th_v_intrinsic.c: New test.
* gcc.target/riscv/rvv/xtheadvector.c: New test.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
When we have @rpath support by virtue of the OS version we're hosting on
we still need to omit those rpath entries when targeting < 10.5 (or the
linker will complain). To do this we (maybe ab-)use a property of the
spec function expansion that a non-null return value can be used as the
true input to a second spec (whereas, unfortunately, we cannot pass specs
to the version function at present).
gcc/ChangeLog:
* config/darwin.h (DARWIN_RPATH_SPEC): Arrange for the %P spec
to be conditional on macosx-version-min.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
We have a typo in the metadata for assigning NSStrings to a specific
section for the V1 (32b) ABI. When that is fixed we should never see
the case where the section needs to be deduced from the properties of
the DECLs.
gcc/ChangeLog:
* config/darwin.cc (darwin_objc1_section): Use the correct
meta-data version for constant strings.
(machopic_select_section): Assert if we fail to handle CFString
sections as Obejctive-C meta-data or drectly.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
Although this only fires for one of the Darwin sub-ports, it is latent
elsewhere, it is also a regression c.f. the Darwin system compiler.
In the code we imported from an earlier branch, CFString objects (which
are constant aggregates) are constructed as CONST_DECLs. Although our
current documentation suggests that these are reserved for enumeration
values, in fact they are used elsewhere in the compiler for constants.
This includes Objective-C where they are used to form NSString constants.
In the particular case, we take the address of the constant and that
triggers varasm.cc:decode_addr_constant, which does not currently support
CONST_DECL.
If there is a general intent to allow/encourage wider use of CONST_DECL,
then we should fix decode_addr_constant to look through these and evaluate
the initializer (a two-line patch, but I'm not suggesting it for stage-4).
We also need to update the GCC internals documentation to allow for the
additional uses.
This patch is Darwin-local and fixes the problem by making the CFString
constants into regular variable but TREE_CONSTANT+TREE_READONLY. I plan
to back-port this to the open branches once it has baked a while on trunk.
Since, for Darwin, the Objective-C default is to construct constant
NSString objects as CFStrings; this will also cover the majority of cases
there (this patch does not make any changes to Objective-C NSStrings).
PR target/105522
gcc/ChangeLog:
* config/darwin.cc (machopic_select_section): Handle C and C++
CFStrings.
(darwin_rename_builtins): Move this out of the CFString code.
(darwin_libc_has_function): Likewise.
(darwin_build_constant_cfstring): Create an anonymous var to
hold each CFString.
* config/darwin.h (ASM_OUTPUT_LABELREF): Handle constant
CFstrings.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
gcc/
* config/avr/avr-log.cc: Tabify.
|
|
While running various benchmarks, I notice we miss vi variant support for integer comparison.
That is, we can vectorize code into vadd.vi but we can't vectorize into vmseq.vi.
Consider this following case:
void
foo (int n, int **__restrict a)
{
int b;
int c;
int d;
for (b = 0; b < n; b++)
for (long e = 8; e > 0; e--)
a[b][e] = a[b][e] == 15;
}
Before this patch:
vsetivli zero,4,e32,m1,ta,ma
vmv.v.i v4,15
vmv.v.i v3,1
vmv.v.i v2,0
.L3:
ld a5,0(a1)
addi a4,a5,4
addi a5,a5,20
vle32.v v1,0(a5)
vle32.v v0,0(a4)
vmseq.vv v0,v0,v4
After this patch:
ld a5,0(a1)
addi a4,a5,4
addi a5,a5,20
vle32.v v1,0(a5)
vle32.v v0,0(a4)
vmseq.vi v0,v0,15
It's the missing feature caused by our some mistakes, support vi variant for vec_cmp like other patterns (add, sub, ..., etc).
Tested with no regression, ok for trunk ?
gcc/ChangeLog:
* config/riscv/autovec.md: Support vi variant.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-1.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-2.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-3.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-4.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-5.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-6.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-7.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-8.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/cmp_vi-9.c: New test.
* gcc.target/riscv/rvv/autovec/cmp/macro.h: New test.
|
|
gcc/
* config/avr/avr-devices.cc: Tabify.
|
|
gcc/
* config/avr/avr-c.cc: Tabify.
|
|
gcc/
* config/avr/driver-avr.cc: Tabify.
|
|
gcc/
* config/avr/gen-avr-mmcu-texi.cc: Tabify.
|
|
gcc/
* config/avr/gen-avr-mmcu-specs.cc: Tabify.
|
|
As I wrote recently, Bool is an undocumented unsupported keyword, as
can be seen by
grep Bool doc/options.texi *.awk
The option parsing just parses and ignores all keywords it doesn't handle.
But, because it isn't a supported keyword, I think we shouldn't have it in
*.opt files, because that just means people copy it over to other places
even when it doesn't have any effect.
Tested with a cross to riscv64-linux, none of the generated
options.{h,cc} options-{save,urls}.cc
files change with the patch, only optionlist does (but that is just
used as a source for those files).
2024-01-18 Jakub Jelinek <jakub@redhat.com>
* config/riscv/riscv.opt (mshorten-memrefs, mrelax, mcsr-check,
minline-strcmp, minline-strncmp, minline-strlen,
-param=riscv-vector-abi): Remove Bool keywords.
|
|
x86_function_profiler emits assembly directly into file and only emits
AT&T syntax. The following patch adjusts it to emit MASM syntax
if -masm=intel.
As it doesn't use asm_fprintf, I can't use {|} syntax for the dialects.
I've tested using
for i in -mcmodel=large "-mcmodel=large -fpic" "" -fpic "-m32 -fpic" "-m32"; do
./xgcc -B ./ -c -O2 -fprofile $i -masm=att pr113122.c -o pr113122.o1;
./xgcc -B ./ -c -O2 -fprofile $i -masm=intel pr113122.c -o pr113122.o2;
objdump -dr pr113122.o1 > /tmp/1; objdump -dr pr113122.o2 > /tmp/2;
diff -up /tmp/1 /tmp/2; done
that the emitted sequences are identical after assembly.
2024-01-18 Jakub Jelinek <jakub@redhat.com>
PR target/113122
* config/i386/i386.cc (x86_function_profiler): Add -masm=intel
support. Add missing space after , in emitted assembly in some
cases. Formatting fixes.
* gcc.target/i386/pr113122-1.c: New test.
* gcc.target/i386/pr113122-2.c: New test.
* gcc.target/i386/pr113122-3.c: New test.
* gcc.target/i386/pr113122-4.c: New test.
|
|
We don't allow SImode in FCC, so constraint z is never really used
here.
gcc/ChangeLog:
* config/loongarch/loongarch.md (movsi_internal): Remove
constraint z.
|
|
gcc/
* config/avr/gen-avr-mmcu-specs.cc (diagnose_rodata_in_ram): Fix typo
in the diagnostic, and capitalize the device name.
(print_mcu): Generate specs such that:
<*check_rodata_in_ram>: New.
<*cc1_misc>: Use check_rodata_in_ram instead of cc1_rodata_in_ram.
<*link_misc>: Use check_rodata_in_ram instead of link_rodata_in_ram.
<*cc1_rodata_in_ram, *link_rodata_in_ram>: Remove.
|
|
table belongs.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_split_symbol):
Assign the '/u' attribute to the mem.
gcc/testsuite/ChangeLog:
* g++.target/loongarch/got-load.C: New test.
|
|
V3: Rebase to trunk and commit it.
This patch fixes SPEC2017 cam4 mismatch issue due to we miss has compatible check
for conflict vsetvl fusion.
Buggy assembler before this patch:
.L69:
vsetvli a5,s1,e8,mf4,ta,ma -> buggy vsetvl
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.i v1,0
vse8.v v1,0(a5)
j .L37
.L68:
vsetvli a5,s1,e8,mf4,ta,ma -> buggy vsetvl
vsetivli zero,8,e8,mf2,ta,ma
addi a3,a5,8
vmv.v.i v1,0
vse8.v v1,0(a5)
vse8.v v1,0(a3)
addi a4,a4,-16
li a3,8
bltu a4,a3,.L37
j .L69
.L67:
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.i v1,0
vse8.v v1,0(a5)
addi a5,sp,56
vse8.v v1,0(a5)
addi s4,sp,64
addi a3,sp,72
vse8.v v1,0(s4)
vse8.v v1,0(a3)
addi a4,a4,-32
li a3,16
bltu a4,a3,.L36
j .L68
After this patch:
.L63:
ble s1,zero,.L49
slli a4,s1,3
li a3,32
addi a5,sp,48
bltu a4,a3,.L62
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.i v1,0
vse8.v v1,0(a5)
addi a5,sp,56
vse8.v v1,0(a5)
addi s4,sp,64
addi a3,sp,72
vse8.v v1,0(s4)
addi a4,a4,-32
addi a5,sp,80
vse8.v v1,0(a3)
.L35:
li a3,16
bltu a4,a3,.L36
addi a3,a5,8
vmv.v.i v1,0
addi a4,a4,-16
vse8.v v1,0(a5)
addi a5,a5,16
vse8.v v1,0(a3)
.L36:
li a3,8
bltu a4,a3,.L37
vmv.v.i v1,0
vse8.v v1,0(a5)
Tested on both RV32/RV64 no regression, Ok for trunk ?
PR target/113429
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info): Fix bug.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-4.c: Adapt test.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-5.c: Ditto.
|
|
[PR113221]
So the problem here is that aarch64_ldp_reg_operand will all subreg even subreg of lo_sum.
When LRA tries to fix that up, all things break. So the fix is to change the check to only
allow reg and subreg of regs.
Note the tendancy here is to use register_operand but that checks the mode of the register
but we need to allow a mismatch modes for this predicate for now.
Built and tested for aarch64-linux-gnu with no regressions
(Also tested with the LD/ST pair pass back on).
PR target/113221
gcc/ChangeLog:
* config/aarch64/predicates.md (aarch64_ldp_reg_operand): For subreg,
only allow REG operands instead of allowing all.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr113221-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
When staring at VSETVL pass for PR/113429, spotted some minor
improvements.
1. For readablity, remove some redundant condition check in Phase 2
function earliest_fuse_vsetvl_info ().
2. Add iteration count in debug prints in same function.
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (earliest_fuse_vsetvl_info):
Remove redundant checks in else condition for readablity.
(earliest_fuse_vsetvl_info) Print iteration count in debug
prints.
(earliest_fuse_vsetvl_info) Fix misleading vsetvl info
dump details in certain cases.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
RVV requires VSET?VL? instructions to dynamically configure VLEN at
runtime. There's a custom pass to do that which has a simple mode
which generates a VSETVL for each V insn and a lazy/optimal mode which
uses LCM dataflow to move VSETVL around, identify/delete the redundant
ones.
Currently simple mode is default for !optimize invocations while lazy
mode being the default.
This patch allows simple mode to be forced via a toggle independent of
the optimization level. A lot of gcc developers are currently doing this
in some form in their local setups, as in the initial phase of autovec
development issues are expected. It makes sense to provide this facility
upstream. It could potentially also be used by distro builder for any
quick workarounds in autovec bugs of future.
gcc/ChangeLog:
* config/riscv/riscv.opt: New -param=vsetvl-strategy.
* config/riscv/riscv-opts.h: New enum vsetvl_strategy_enum.
* config/riscv/riscv-vsetvl.cc
(pre_vsetvl::pre_global_vsetvl_info): Use vsetvl_strategy.
(pass_vsetvl::execute): Use vsetvl_strategy.
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
I have noticed quite bad pasto in handling of X86_TUNE_AVOID_512FMA_CHAINS. At the
moment it is ignored, but X86_TUNE_AVOID_256FMA_CHAINS controls 512FMA too.
This patch fixes it, we may want to re-check how that works on AVX512 machines.
gcc/ChangeLog:
* config/i386/i386-options.cc (ix86_option_override_internal): Fix
handling of X86_TUNE_AVOID_512FMA_CHAINS.
|
|
$GP is used for expanding GOT load, but in the afterward passes,
a temporary register is tried to replace $gp.
If sucess, we have no need to store and reload $gp. The example
of failure is that the function calls a preemtive function.
We shouldn't use $GP for any other purpose in the code we generate.
If a user's inline asm code clobbers $GP, it's their duty to save
and restore $GP.
gcc
* config/mips/mips.cc (mips_compute_frame_info): If another
register is used as global_pointer, mark $GP live false.
gcc/testsuite
* gcc.target/mips/mips.exp (mips_option_groups):
Add -mxgot/-mno-xgot options.
* gcc.target/mips/xgot-n32-avoid-gp.c: New test.
* gcc.target/mips/xgot-n32-need-gp.c: New test.
|
|
Add support for -mcpu=cobalt-100 (Neoverse N2 with a different implementer ID).
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add 'cobalt-100' CPU.
* config/aarch64/aarch64-tune.md: Regenerated.
* doc/invoke.texi (-mcpu): Add cobalt-100 core.
|
|
GCC tends to optimistically create CONST of globals with an immediate offset.
However it is almost always better to CSE addresses of globals and add immediate
offsets separately (the offset could be merged later in single-use cases).
Splitting CONST expressions with an index in aarch64_legitimize_address fixes
part of PR112573.
gcc/ChangeLog:
PR target/112573
* config/aarch64/aarch64.cc (aarch64_legitimize_address): Reassociate
badly formed CONST expressions.
gcc/testsuite/ChangeLog:
PR target/112573
* gcc.target/aarch64/pr112573.c: Add new test.
|
|
This is to handle the membar_empty instruction that can be generated
when compiling for UT699.
gcc/ChangeLog:
* config/sparc/sparc.cc (next_active_non_empty_insn): Length 0 treated as empty
|
|
LEON now uses the standard V8 membar patterns that contains an ldstub
instruction. This instruction needs to be aligned properly when the
GR712RC errata workaround is enabled.
gcc/ChangeLog:
* config/sparc/sparc.cc (atomic_insn_for_leon3_p): Treat membar_storeload as atomic
* config/sparc/sync.md (membar_storeload): Turn into named insn
and add GR712RC errata workaround.
(membar_v8): Add GR712RC errata workaround.
|
|
LEON5 has a deeper write-buffer and hence stb is not enough to flush a
write out. For compatibility, use the default V8 approach for both
LEON3 and LEON5.
This reverts commit 49cc765db35a5a21cab2aece27a44983fa70b94b,
"sync.md (*membar_storeload_leon3): New insn."
gcc/ChangeLog:
* config/sparc/sync.md (*membar_storeload_leon3): Remove
(*membar_storeload): Enable for LEON
|
|
gcc/
* config/avr/avr-mcus.def (avr16eb14, avr16eb20, avr16eb28, avr16eb32)
(avr16ea28, avr16ea32, avr16ea48, avr32ea28, avr32ea32, avr32ea48): Add.
* doc/avr-mmcu.texi: Regenerate.
|
|
As PR113404 mentioned: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113404
We have ICE when we enable RVV in big-endian mode:
during RTL pass: expand
a-float-point-dynamic-frm-66.i:2:14: internal compiler error: in to_constant, at poly-int.h:588
0xab4c2c poly_int<2u, unsigned short>::to_constant() const
/repo/gcc-trunk/gcc/poly-int.h:588
0xab4de1 poly_int<2u, unsigned short>::to_constant() const
/repo/gcc-trunk/gcc/tree.h:4055
0xab4de1 default_function_arg_padding(machine_mode, tree_node const*)
/repo/gcc-trunk/gcc/targhooks.cc:844
0x12e2327 locate_and_pad_parm(machine_mode, tree_node*, int, int, int, tree_node*, args_size*, locate_and_pad_arg_data*)
/repo/gcc-trunk/gcc/function.cc:4061
0x12e2aca assign_parm_find_entry_rtl
/repo/gcc-trunk/gcc/function.cc:2614
0x12e2c89 assign_parms
/repo/gcc-trunk/gcc/function.cc:3693
0x12e59df expand_function_start(tree_node*)
/repo/gcc-trunk/gcc/function.cc:5152
0x112fafb execute
/repo/gcc-trunk/gcc/cfgexpand.cc:6739
Report users that we don't support RVV in big-endian mode for the following reasons:
1. big-endian in RISC-V is pretty rare case.
2. We didn't test RVV in big-endian and we don't have enough time to test it since it's stage 4 now.
Naive disallow RVV in big-endian.
Tested no regression, ok for trunk ?
gcc/ChangeLog:
PR target/113404
* config/riscv/riscv.cc (riscv_override_options_internal): Report sorry
for RVV in big-endian mode.
gcc/testsuite/ChangeLog:
PR target/113404
* gcc.target/riscv/rvv/base/big_endian-1.c: New test.
* gcc.target/riscv/rvv/base/big_endian-2.c: New test.
|
|
Thanks the
https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/389, we
need not to maintain the psabi checking any more.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_arg_has_vector): Delete.
(riscv_pass_in_vector_p): Delete.
(riscv_init_cumulative_args): Delete the checking.
(riscv_get_arg_info): Delete the checking.
(riscv_function_value): Delete the checking.
* config/riscv/riscv.h: Delete the member for checking.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/binop_vx_constraint-120.c: Delete the -Wno-psabi.
* gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Ditto.
* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: Ditto.
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Ditto.
* gcc.target/riscv/rvv/base/pr110109-2.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-9.c: Ditto.
* gcc.target/riscv/rvv/base/spill-10.c: Ditto.
* gcc.target/riscv/rvv/base/spill-11.c: Ditto.
* gcc.target/riscv/rvv/base/spill-9.c: Ditto.
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Ditto.
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Ditto.
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Ditto.
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Ditto.
* gcc.target/riscv/rvv/base/vector-abi-1.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-2.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-3.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-4.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-5.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-6.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-7.c: Removed.
* gcc.target/riscv/rvv/base/vector-abi-8.c: Removed.
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com<mailto:yanzhang.wang@intel.com>>
|
|
This patch adds C intrinsics for Bitmanip Extension.
RISCV_BUILTIN_NO_PREFIX is a new riscv_builtin_description like RISCV_BUILTIN.
But it uses CODE_FOR_##INSN rather than CODE_FOR_riscv_##INSN.
Changed orcb, clmul, brev8 pattern's mode form X to GPR because orcbsi, clmul_si,
brev8_si are both included in rv32 and rv64. Test them in scalar_bitmanip_intrinsic-64-emulated.c.
gcc/ChangeLog:
* config.gcc: Include riscv_bitmanip.h.
* config/riscv/bitmanip.md: Changed mode form X to GPR in orcb and clmul pattern.
* config/riscv/crypto.md: Changed mode form X to GPR in brev8 pattern.
* config/riscv/riscv-builtins.cc (AVAIL): Adding new bitmanip builtins.
(RISCV_BUILTIN_NO_PREFIX): New helper macro.
* config/riscv/riscv-cmo.def (RISCV_BUILTIN): Add '_32'/'_64' postfix to builtins.
* config/riscv/riscv-ftypes.def (2): New ftypes.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): New builtins.
(RISCV_BUILTIN_NO_PREFIX): Likewise.
* config/riscv/riscv_bitmanip.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_bitmanip_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64.c: New test.
|
|
This patch adds C intrinsics for Scalar Crypto Extension.
gcc/ChangeLog:
* config.gcc: Include riscv_crypto.h.
* config/riscv/riscv_crypto.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_crypto_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_crypto_intrinsic-64.c: New test.
|
|
driver-avr.cc contains a spec that discriminates bwtween cores
and devices by means of a mmcu=avr* spec pattern. This does not
work for new devices like AVR128* which also start with mmcu=avr
like all cores do. The patch uses a new spec function in order to
tell apart cores from devices.
gcc/
PR target/107201
* config/avr/avr.h (EXTRA_SPEC_FUNCTIONS): Add no-devlib, avr_no_devlib.
* config/avr/driver-avr.cc (avr_no_devlib): New function.
(avr_devicespecs_file): Use it to remove -nodevicelib from the
options for cores only.
* config/avr/avr-arch.h (avr_get_parch): New prototype.
* config/avr/avr-devices.cc (avr_get_parch): New function.
|
|
coremark-pro
This patch fixes -70% performance drop from GCC-13.2 to GCC-14 with -march=rv64gcv in real hardware.
The root cause is incorrect cost model cause inefficient vectorization which makes us performance drop significantly.
So this patch does:
1. Adjust vector to scalar cost by introducing v to scalar reg move.
2. Adjust vec_construct cost since we does spend NUNITS instructions to construct the vector.
Tested on both RV32/RV64 no regression, Rebase to the trunk and commit it as it is approved by Robin.
PR target/113247
gcc/ChangeLog:
* config/riscv/riscv-protos.h (struct regmove_vector_cost): Add vector to scalar regmove.
* config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Ditto.
* config/riscv/riscv.cc (riscv_builtin_vectorization_cost): Adjust vec_construct cost.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/reduc-19.c: Adapt test.
* gcc.target/riscv/rvv/autovec/vls/reduc-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/reduc-21.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-2.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113247-4.c: New test.
|
|
Rebase in v3: Rebase to the trunk and commit it as it's approved by Robin.
Update in v2: Add dynmaic lmul test.
This patch fixes the regression between GCC 13.2.0 and trunk GCC (GCC-14)
GCC 13.2.0:
lui a5,%hi(a)
li a4,19
sb a4,%lo(a)(a5)
li a0,0
ret
Trunk GCC:
vsetvli a5,zero,e8,mf2,ta,ma
li a4,-32768
vid.v v1
vsetvli zero,zero,e16,m1,ta,ma
addiw a4,a4,104
vmv.v.i v3,15
lui a1,%hi(a)
li a0,19
vsetvli zero,zero,e8,mf2,ta,ma
vadd.vi v1,v1,1
sb a0,%lo(a)(a1)
vsetvli zero,zero,e16,m1,ta,ma
vzext.vf2 v2,v1
vmv.v.x v1,a4
vminu.vv v2,v2,v3
vsrl.vv v1,v1,v2
vslidedown.vi v1,v1,17
vmv.x.s a0,v1
snez a0,a0
ret
The root cause we are vectorizing the codes inefficiently since we doesn't cost len when NITERS < VF.
Leverage loop control of mask targets or rs6000 fixes the regression.
Tested no regression. Ok for trunk ?
PR target/113281
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (costs::adjust_vect_cost_per_loop): New function.
(costs::finish_cost): Adjust cost for LOOP LEN with NITERS < VF.
* config/riscv/riscv-vector-costs.h: New function.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: New test.
|
|
Notice the m_num_vector_iterations is not used, remove the redundant codes.
Committed.
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (costs::analyze_loop_vinfo):
Remove m_num_vector_iterations.
* config/riscv/riscv-vector-costs.h: Ditto.
|
|
Multilib options -mdouble= and -mlong-double= are not orthogonal:
TARGET_HANDLE_OPTION = avr-common.cc::avr_handle_option() sets them
such that sizeof(double) <= sizeof(long double) is always true.
gcc/
PR target/113156
* config/avr/avr.opt (-mdouble, -mlong-double): Add "Save" flag.
(-mbranch-cost): Set "Optimization" flag.
|
|
This patch fixes the following FAILs:
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.c-torture/execute/pr68532.c -O0 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O1 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O2 execution test
FAIL: gcc.c-torture/execute/pr68532.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.c-torture/execute/pr68532.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr68532.c -Os execution test
FAIL: gcc.c-torture/execute/pr68532.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
Running target riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
Running target riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/vect/pr60196-1.c execution test
FAIL: gcc.dg/vect/pr60196-1.c -flto -ffat-lto-objects execution test
The root cause is attributes of ternary intructions are incorrect which cause AVL prop PASS and VSETVL PASS behave
incorrectly.
Tested no regression and committed.
PR target/113393
gcc/ChangeLog:
* config/riscv/vector.md: Fix ternary attributes.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr113393-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr113393-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr113393-3.c: New test.
|
|
These devices see a 32 KiB block of their program memory (flash) in
the RAM address space. This can be used to support .rodata in flash
provided Binutils support PR31124 (Add new emulations which locate
.rodata in flash). This patch does the following:
* configure checks availability of Binutils PR31124.
* Add new command line options -mrodata-in-ram and -mflmap.
While -flmap is for internal usage (communicate hardware properties
from device-specs to the compiler proper), -mrodata-in-ram is a user
space option that allows to return to the current rodata-in-ram layout.
* Adjust gen-avr-mmcu-specs.cc so that device-specs are generated
that sanity check options, and that translate -m[no-]rodata-in-ram
to its emulation.
* Objects in .rodata don't drag __do_copy_data.
* Document new options and built-in macros.
PR target/112944
gcc/
* configure.ac [target=avr]: Check availability of emulations
avrxmega2_flmap and avrxmega4_flmap, resulting in new config vars
HAVE_LD_AVR_AVRXMEGA2_FLMAP and HAVE_LD_AVR_AVRXMEGA4_FLMAP.
* configure: Regenerate.
* config.in: Regenerate.
* doc/invoke.texi (AVR Options): Document -mflmap, -mrodata-in-ram,
__AVR_HAVE_FLMAP__, __AVR_RODATA_IN_RAM__.
* config/avr/avr.opt (-mflmap, -mrodata-in-ram): New options.
* config/avr/avr-arch.h (enum avr_device_specific_features):
Add AVR_ISA_FLMAP.
* config/avr/avr-mcus.def (AVR_MCU) [avr64*, avr128*]: Set isa flag
AVR_ISA_FLMAP.
* config/avr/avr.cc (avr_arch_index, avr_has_rodata_p): New vars.
(avr_set_core_architecture): Set avr_arch_index.
(have_avrxmega2_flmap, have_avrxmega4_flmap)
(have_avrxmega3_rodata_in_flash): Set new static const bool according
to configure results.
(avr_rodata_in_flash_p): New function using them.
(avr_asm_init_sections): Let readonly_data_section->unnamed.callback
track avr_need_copy_data_p only if not avr_rodata_in_flash_p().
(avr_asm_named_section): Track avr_has_rodata_p.
(avr_file_end): Emit __do_copy_data also when avr_has_rodata_p
and not avr_rodata_in_flash_p ().
* config/avr/specs.h (CC1_SPEC): Add %(cc1_rodata_in_ram).
(LINK_SPEC): Add %(link_rodata_in_ram).
(LINK_ARCH_SPEC): Remove.
* config/avr/gen-avr-mmcu-specs.cc (have_avrxmega3_rodata_in_flash)
(have_avrxmega2_flmap, have_avrxmega4_flmap): Set new static
const bool according to configure results.
(diagnose_mrodata_in_ram): New function.
(print_mcu): Generate specs with the following changes:
<*cc1_misc, *asm_misc, *link_misc>: New specs so that we don't
need to extend avr/specs.h each time we add a new bell or whistle.
<*cc1_rodata_in_ram, *link_rodata_in_ram>: New specs to diagnose
-m[no-]rodata-in-ram.
<*cpp_rodata_in_ram>: New. Does -D__AVR_RODATA_IN_RAM__=0/1.
<*cpp_mcu>: Add -D__AVR_AVR_FLMAP__ if it applies.
<*cpp>: Add %(cpp_rodata_in_ram).
<*link_arch>: Use emulation avrxmega2_flmap, avrxmega4_flmap as
requested.
<*self_spec>: Add -mflmap or %<mflmap as needed.
gcc/testsuite/
* gcc.target/avr/torture/pr112944-flmap-0.c: New test.
* gcc.target/avr/torture/pr112944-flmap-1.c: New test.
|
|
mips bootstraps have been broken for a while. They've been triggering an error
about mutually exclusive equal-tests always being false when building
gencondmd.
This was ultimately tracked down to the ior<mode>3_mips16_asmacro pattern. The
pattern uses the GPR mode iterator which looks like this:
(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
The condition for the pattern looks like this:
"ISA_HAS_MIPS16E2"
And if you dig into ISA_HAS_MIPS16E2:
/* The MIPS16e V2 instructions are available. */
&& !TARGET_64BIT)
The way the mode iterator is handled is by adding its condition to the
pattern's condition when we expand copies of the pattern resulting in this
condition for one of the two generated patterns:
(TARGET_MIPS16 && TARGET_MIPS16E2 && !TARGET_64BIT) && TARGET_64BIT
This can never be true because of the TARGET_64BIT tests.
The fix is trivial. Don't use a mode iterator on that pattern.
Bootstrapped on mips64el. I don't have any tests to compare against, so no
regression test data.
gcc/
* config/mips/mips.md (ior<mode>3_mips16_asmacro): Use SImode,
not the GPR iterator. Adjust pattern name and mode attribute
accordingly.
|