Age | Commit message (Collapse) | Author | Files | Lines |
|
The Arm Architecture Reference Manual (version J.a, section A2.9 on FEAT_LS64)
shows that ls64 is an optional extension and should not be enabled by default
for Armv8.7-a.
This drops it from the mandatory bits for the architecture and brings GCC in line
with LLVM and the architecture.
Note that binutils will be left unchanged, to preserve compatibility with older
released compilers.
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (AARCH64_ARCH): Remove LS64 from
Armv8.7-a.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/acle/ls64.C: Add +ls64.
* g++.target/aarch64/acle/ls64_lto.C: Likewise.
* gcc.target/aarch64/acle/ls64_lto.c: Likewise.
* gcc.target/aarch64/acle/pr110100.c: Likewise.
* gcc.target/aarch64/acle/pr110132.c: Likewise.
* gcc.target/aarch64/options_set_28.c: Drop check for nols64.
* gcc.target/aarch64/pragma_cpp_predefs_2.c: Correct header checks.
|
|
This test has never worked on AArch64 since the day it was committed. It has
a number of issues that prevent it from working on AArch64:
The test failures appear to be known and triaged, so until they are fixed
there's no point in running this test.
gcc/testsuite/ChangeLog:
PR fortran/107071
* gfortran.dg/ieee/modes_1.f90: skip aarch64, arm.
|
|
The sequence to commit a lazy save includes a branch based on
whether TPIDR2_EL0 is zero. The code assumed that CBZ could
be used for this, but that instruction is forbidden when
-mtrack-speculation is being used.
gcc/
* config/aarch64/aarch64.cc (aarch64_mode_emit_local_sme_state):
Use aarch64_gen_compare_zero_and_branch rather than emitting
a CBZ directly.
gcc/testsuite/
* gcc.target/aarch64/sme/locally_streaming_1_ts.c: New test.
* gcc.target/aarch64/sme/sibcall_7_ts.c: Likewise.
|
|
I noticed while working on another patch that we had a duplicated
call to aarch64_process_target_attr.
gcc/
* config/aarch64/aarch64.cc (aarch64_option_valid_attribute_p):
Remove duplicated call.
|
|
In:
void bar() __arm_inout("za");
void foo() __arm_inout("za", "zt0") { bar(); }
foo cannot tail-call bar because foo needs to restore ZT0 after
the call. I'd forgotten to update the ok_for_sibcall rules
to handle this when adding SME2.
Thanks to Sander de Smalen for the spot.
gcc/
* config/aarch64/aarch64.cc (aarch64_function_ok_for_sibcall):
Check that each individual piece of state is shared in the same
way, rather than using an aggregate check for PSTATE.ZA.
gcc/testsuite/
* gcc.target/aarch64/sme/sibcall_9.c: New test.
|
|
ACLE guarantees that a function like:
__arm_new("zt0") foo() { ... }
will start with ZT0 equal to zero. I'd forgotten to enforce that
after committing a lazy save. After such a save, we should zero
ZA iff the function has ZA state and zero ZT0 iff the function
has ZT0 state.
gcc/
* config/aarch64/aarch64.cc (aarch64_mode_emit_local_sme_state):
In the code that commits a lazy save, only zero ZA if the function
has ZA state. Similarly zero ZT0 if the function has ZT0 state.
gcc/testsuite/
* gcc.target/aarch64/sme/zt0_state_5.c (test3): Expect ZT0 rather
than ZA to be zeroed.
(test5): Remove zeroing of ZA.
|
|
The main purpose of the aarch64_commit_lazy_save pattern
was to defer insertion of a half-diamond until splitting,
since splitting knew how to create the associated basic blocks.
However, the fix for PR113220 means that mode-switching also
knows how to do that. This patch therefore removes the pattern
and emits the subinstructions directly.
On its own, this is actually a slight regression, since it
means we keep an unnecessary zero { za }. But the cases
where that happens are wrong for a different reason, and this
patch is a prerequisite to fixing it.
gcc/
* config/aarch64/aarch64-sme.md (aarch64_commit_lazy_save): Remove,
directly inserting the associated sequence
* config/aarch64/aarch64.cc (aarch64_mode_emit_local_sme_state):
...here instead.
gcc/testsuite/
* gcc.target/aarch64/sme/zt0_state_5.c (test3, test5): Expect
zero { za }s.
|
|
This patch fixes an ICE for a combination of:
- -fstack-clash-protection
- a frame that has SVE save slots
- a frame that has no GPR save slots
- a frame that has a VG save slot
The allocation code was folding the SVE save slot allocation into
the initial frame allocation, so that we had one allocation of
size <size of SVE registers> + 16. But the VG save code itself
expected the allocations to remain separate, since it wants to
store at a constant offset from SP or FP.
The VG save isn't shrink-wrapped and so acts as a probe of the
initial allocations. It should therefore be safe to keep separate
allocations in this case.
The scans in locally_streaming_1.c expect no stack clash protection,
so the patch forces that and adds a separate compile-only test for
when protection is enabled.
gcc/
PR target/113995
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Don't
fold the SVE allocation into the initial allocation if the
initial allocation includes a VG save.
gcc/testsuite/
PR target/113995
* gcc.target/aarch64/sme/locally_streaming_1.c: Require
-fno-stack-clash-protection.
* gcc.target/aarch64/sme/locally_streaming_1_scp.c: New test.
|
|
In this PR, the SME mode-switching code needs to insert a stack-probe
loop for an alloca. This patch allows the target to do that.
There are two parts to it: allowing loops for insertions in blocks,
and allowing them for insertions on edges. The former can be handled
entirely within mode-switching itself, by recording which blocks have
had new branches inserted. The latter requires an extension to
commit_one_edge_insertion.
I think the extension to commit_one_edge_insertion makes logical sense,
since it already explicitly allows internal loops during RTL expansion.
The single-block find_sub_basic_blocks is a relatively recent addition,
so wouldn't have been available when the code was originally written.
The patch also has a small and obvious fix to make the aarch64 emit
hook cope with labels.
I've added specific -fstack-clash-protection versions of all
aarch64-sme.exp tests that previously failed because of this bug.
I've also added -fno-stack-clash-protection to the original versions
of these tests if they contain scans that assume no protection.
gcc/
PR target/113220
* cfgrtl.cc (commit_one_edge_insertion): Handle sequences that
contain jumps even if called after initial RTL expansion.
* mode-switching.cc: Include cfgbuild.h.
(optimize_mode_switching): Allow the sequence returned by the
emit hook to contain internal jumps. Record which blocks
contain such jumps and split the blocks at the end.
* config/aarch64/aarch64.cc (aarch64_mode_emit): Check for
non-debug insns when scanning the sequence.
gcc/testsuite/
PR target/113220
* gcc.target/aarch64/sme/call_sm_switch_5.c: Add
-fno-stack-clash-protection.
* gcc.target/aarch64/sme/call_sm_switch_5_scp.c: New test.
* gcc.target/aarch64/sme/sibcall_6_scp.c: New test.
* gcc.target/aarch64/sme/za_state_4.c: Add
-fno-stack-clash-protection.
* gcc.target/aarch64/sme/za_state_4_scp.c: New test.
* gcc.target/aarch64/sme/za_state_5.c: Add
-fno-stack-clash-protection.
* gcc.target/aarch64/sme/za_state_5_scp.c: New test.
|
|
The main 'arch' context selector for nvptx is, well, 'nvptx';
however, as 'nvptx64' is used by LLVM, it makes sense
to support it as well.
Note that LLVM has: "The triple architecture can be one of
``nvptx`` (32-bit PTX) or ``nvptx64`` (64-bit PTX)."
GCC effectively only supports the 64-bit variant (at least for
offloading). Thus, GCC's 'nvptx' is not quite the same as LLVM's.
The device-compiler part (nvptx_omp_device_kind_arch_isa) uses
TARGET_ABI64 such that nvptx64 is only defined with -m64.
gcc/ChangeLog:
* config/nvptx/gen-omp-device-properties.sh: Add 'nvptx64' to arch.
* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Context Selectors): Add 'nvptx64' as additional
'arch' value for nvptx.
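
As a rough illustration of where the new 'arch' value can appear, here is a minimal sketch of an OpenMP 'declare variant' directive that selects on 'nvptx64'. The function names scale and scale_nvptx64 are hypothetical and not part of the patch. Without -fopenmp the pragma is ignored and the base function is always called.

```c
/* Hypothetical device-specific variant; the name is illustrative only.  */
int scale_nvptx64 (int x)
{
  return x * 2;
}

/* Base function.  With OpenMP enabled, calls made in a context whose
   device 'arch' matches "nvptx64" may be redirected to the variant.  */
#pragma omp declare variant (scale_nvptx64) match (device = {arch ("nvptx64")})
int scale (int x)
{
  return x + x;
}
```

Both functions compute the same result, so the selector only affects which implementation is chosen, not the observable value.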
|
|
DSE, DCE, and other passes are removing redundant signaling comparisons
from these tests, but the whole point is to check that GCC knows how to
emit them. Use -fno-delete-dead-exceptions to prevent that.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/autovec-double-signaling-eq.c:
Preserve exceptions.
* gcc.target/s390/zvector/autovec-float-signaling-eq.c:
Likewise.
|
|
The plan to maintain PRU hardware-specific specs in the newlib tree has been
abandoned in favour of a new distinct Git project. Update the
documentation accordingly.
gcc/ChangeLog:
* doc/invoke.texi (-mmcu): Add information about MCU specs.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
The minimal runtime has been documented from the beginning to break some
standard features in order to reduce code size, while keeping
the features required by typical firmware programs. Document one more
imposed restriction - the main() function must take no arguments.
gcc/ChangeLog:
* doc/invoke.texi (-minrt): Clarify that main
must take no arguments.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
gcc/analyzer/ChangeLog:
PR analyzer/113999
* analyzer.h (get_string_cst_size): New decl.
* region-model-manager.cc (get_string_cst_size): New.
(region_model_manager::maybe_get_char_from_string_cst): Treat
single-byte accesses within string_cst but beyond
TREE_STRING_LENGTH as being 0.
* region-model.cc (string_cst_has_null_terminator): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/113999
* c-c++-common/analyzer/strlen-pr113999.c: New test.
* gcc.dg/analyzer/strlen-1.c: More test coverage.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
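
The kind of source this concerns can be sketched in plain C as follows. This is an illustrative assumption, not the contents of the new test: the buffer and helper names are hypothetical. An array initialized from a shorter string literal is zero-filled past the literal, so a single-byte read beyond the literal's length but within the array yields 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer: 8 bytes initialized from a 3-character literal.
   C zero-fills the bytes that the literal does not cover.  */
static const char buf[8] = "abc";

/* Length as seen by strlen: it stops at the first zero byte.  */
size_t buf_strlen (void)
{
  return strlen (buf);
}

/* Read one byte of the buffer; IDX must be less than sizeof (buf).  */
char buf_byte_at (size_t idx)
{
  return buf[idx];
}
```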
|
|
gcc/analyzer/ChangeLog:
PR analyzer/113998
* ranges.cc (symbolic_byte_range::intersection): Handle empty ranges.
(selftest::test_intersects): Add test coverage for empty ranges.
gcc/testsuite/ChangeLog:
PR analyzer/113998
* c-c++-common/analyzer/overlapping-buffers-pr113998.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
|
|
PR fortran/105658
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_intrinsic_to_class): When passing an
array component reference of intrinsic type to a procedure
with an unlimited polymorphic dummy argument, a temporary
should be created.
gcc/testsuite/ChangeLog:
* gfortran.dg/PR105658.f90: New test.
Signed-off-by: Peter Hill <peter.hill@york.ac.uk>
|
|
The PR91865 combine fix changed instruction counts slightly for rlwinm-0.c.
Adjust expected instruction counts accordingly.
2024-02-20 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR target/112103
* gcc.target/powerpc/rlwinm-0.c: Adjust expected instruction counts.
|
|
The AVR built-ins used types like "int" or "char", whose size and
signedness are not fixed: they depend on options such as -mint8
and -f[no-][un-]signed-char. As the built-ins model machine
instructions with specific operand sizes and signedness, use
matching types in their prototypes.
gcc/
* config/avr/builtins.def: Use function prototypes of given size
and signedness.
* config/avr/avr.cc (avr_init_builtins): Adjust types required
by builtins.def.
* doc/extend.texi (AVR Built-in Functions): Adjust accordingly.
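
The underlying portability point can be sketched in generic C. The helper names below are hypothetical and this is not the AVR built-in interface: plain "char" and "int" vary with target and options, while the <stdint.h> exact-width types do not.

```c
#include <limits.h>
#include <stdint.h>

/* Nonzero if plain char is signed here; options such as
   -fsigned-char and -funsigned-char can change the answer.  */
int plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* Width of int in bits; this depends on the target and its options.  */
unsigned int_width_bits (void)
{
  return (unsigned) (sizeof (int) * CHAR_BIT);
}

/* Width of uint8_t in bits; an exact-width type is the same everywhere
   it exists.  */
unsigned uint8_width_bits (void)
{
  return (unsigned) (sizeof (uint8_t) * CHAR_BIT);
}
```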
|
|
gcc/rust/ChangeLog:
* rust-lang.cc (grs_langhook_type_for_mode): simplify code for
xImode. Add missing long_double_type_node.
Signed-off-by: Marc Poulhiès <dkm@kataplop.net>
|
|
gcc/
* doc/extend.texi (AVR Built-in Functions): Use @defbuiltin
instead of @table.
|
|
Add documentation describing the meaning and values for the -mcpu
command-line option.
Tested for bpf-unknown-none on x86_64-linux-gnu host.
gcc/ChangeLog:
* config/bpf/bpf.opt: Add help information for -mcpu.
Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
|
|
This patch makes -mtrack-speculation work on streaming-compatible
functions. There were two related issues. The first is that the
streaming-compatible code was using TB(N)Z unconditionally, whereas
those instructions are not allowed with speculation tracking.
That part can be fixed in a similar way to the recent eh_return
fix (PR112987).
The second issue was that the speculation-tracking pass runs
before some of the conditional branches are inserted. It isn't
safe to insert the branches any earlier, so the patch instead adds
a second speculation-tracking pass that runs afterwards. The new
pass is only used for streaming-compatible functions.
The testcase is adapted from call_sm_switch_1.c.
gcc/
PR target/113805
* config/aarch64/aarch64-passes.def (pass_late_track_speculation):
New pass.
* config/aarch64/aarch64-protos.h (make_pass_late_track_speculation):
Declare.
* config/aarch64/aarch64.md (is_call): New attribute.
(*and<mode>3nr_compare0): Rename to...
(@aarch64_and<mode>3nr_compare0): ...this.
* config/aarch64/aarch64-sme.md (aarch64_get_sme_state)
(aarch64_tpidr2_save, aarch64_tpidr2_restore): Add is_call attributes.
* config/aarch64/aarch64-speculation.cc: Update file comment to
describe the new late pass.
(aarch64_do_track_speculation): Handle is_call insns like other calls.
(pass_track_speculation): Add an is_late member variable.
(pass_track_speculation::gate): Run the late pass for streaming-
compatible functions and the early pass for other functions.
(make_pass_track_speculation): Update accordingly.
(make_pass_late_track_speculation): New function.
* config/aarch64/aarch64.cc (aarch64_gen_test_and_branch): New
function.
(aarch64_guard_switch_pstate_sm): Use it.
gcc/testsuite/
PR target/113805
* gcc.target/aarch64/sme/call_sm_switch_11.c: New test.
|
|
gcc/rust/ChangeLog:
* hir/rust-ast-lower-pattern.cc
(ASTLoweringPattern::visit):
Reset is_let_top_level while visiting GroupedPattern.
gcc/testsuite/ChangeLog:
* rust/compile/let_alt.rs: Check for false positive.
Signed-off-by: Owen Avery <powerboat9.gamer@gmail.com>
|
|
The signature was incorrectly using an i64 for the integer power,
instead of an i32.
gcc/testsuite/ChangeLog:
* rust/compile/torture/intrinsics-math.rs: Adjust powif64
intrinsic signature.
Signed-off-by: Marc Poulhiès <dkm@kataplop.net>
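
For readers unfamiliar with the shape of such an intrinsic, here is a minimal C sketch of an integer power that takes a double base and a 32-bit exponent. The function powi_f64 is a hypothetical reference implementation written for illustration; it is not the code the Rust front end generates.

```c
#include <stdint.h>

/* Hypothetical integer power with a 32-bit exponent, mirroring a
   (double, 32-bit integer) signature.  */
double powi_f64 (double base, int32_t exp)
{
  /* Widen before negating so that INT32_MIN does not overflow.  */
  int64_t n = exp;
  uint64_t mag = (uint64_t) (n < 0 ? -n : n);
  double result = 1.0;

  /* Square-and-multiply over the bits of the exponent magnitude.  */
  while (mag != 0)
    {
      if (mag & 1)
        result *= base;
      base *= base;
      mag >>= 1;
    }

  /* A negative exponent means the reciprocal of the positive power.  */
  return exp < 0 ? 1.0 / result : result;
}
```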
|
|
The testcase fails on i686-linux with
.../gcc/testsuite/gcc.dg/analyzer/torture/vector-extract-1.c:11:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi]
Added -Wno-psabi to silence the warning.
2024-02-20 Jakub Jelinek <jakub@redhat.com>
PR analyzer/113983
* gcc.dg/analyzer/torture/vector-extract-1.c: Add -Wno-psabi as
dg-additional-options.
|
|
target maybe_x32 doesn't check if platform has gnu/stubs-x32.h, but
it's included by stdint.h in the testcase.
Adjust testcase: remove stdint.h, use 'typedef long long int64_t'
instead.
gcc/testsuite/ChangeLog:
PR target/113711
* gcc.target/i386/apx-ndd-x32-1.c: Adjust testcase.
|
|
|
|
gcc/analyzer/ChangeLog:
PR analyzer/111289
* varargs.cc (representable_in_integral_type_p): New.
(va_arg_compatible_types_p): Add "arg_sval" param. Handle integer
types.
(kf_va_arg::impl_call_pre): Pass arg_sval to
va_arg_compatible_types_p.
gcc/testsuite/ChangeLog:
PR analyzer/111289
* c-c++-common/analyzer/stdarg-pr111289-int.c: New test.
* c-c++-common/analyzer/stdarg-pr111289-ptr.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/testsuite/ChangeLog:
PR analyzer/110520
* c-c++-common/analyzer/null-deref-pr110520.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
[PR113983]
After r14-6419-g4eaaf7f5a378e8, maybe_undo_optimize_bit_field_compare would ICE on
a vector CST. This function really should be checking for integer types, so
reject non-integral types early on (as it did for non-char types before r14-6419-g4eaaf7f5a378e8).
Committed as obvious after a build and test on aarch64-linux-gnu with no regressions.
PR analyzer/113983
gcc/analyzer/ChangeLog:
* region-model-manager.cc (maybe_undo_optimize_bit_field_compare): Reject
non-integral types.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/torture/vector-extract-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Currently, these are registered as unsigned_intDI_type_node which is not
necessarily the same type definition as uint64_t. On platforms where these
differ, that causes failures when consuming the arm_acle.h header.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc (aarch64_init_rng_builtins):
Register these builtins with a pointer to uint64_t rather than unsigned
DI mode.
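
To see why the distinction matters, the following C11 sketch reports which standard type uint64_t aliases on the current platform. The helper name is hypothetical. "unsigned long" and "unsigned long long" may both be 64 bits wide yet remain distinct types, so pointers to them are not interchangeable.

```c
#include <stdint.h>

/* Report which standard integer type uint64_t is an alias for on this
   platform.  The two candidates are always distinct types in C, even
   when they have the same width.  */
const char *uint64_underlying_type (void)
{
  return _Generic ((uint64_t) 0,
                   unsigned long: "unsigned long",
                   unsigned long long: "unsigned long long",
                   default: "other");
}
```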
|
|
gcc/po/
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.
libcpp/po/
* be.po, ca.po, da.po, de.po, el.po, eo.po, es.po, fi.po, fr.po,
id.po, ja.po, ka.po, nl.po, pt_BR.po, ro.po, ru.po, sr.po, sv.po,
tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.
|
|
'!TARGET_RDNA2_PLUS' [PR113615]
On top of commit c7ec7bd1c6590cf4eed267feab490288e0b8d691
"amdgcn: add -march=gfx1030 EXPERIMENTAL" conditionalizing
'define_expand "reduc_<reduc_op>_scal_<mode>"' on
'!TARGET_RDNA2' (later: '!TARGET_RDNA2_PLUS'), we then did similar in
commit 7cc2262ec9a410dc56d1c1c6b950c922e14f621d
"gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615]"
to conditionalize 'define_expand "fold_left_plus_<mode>"' on
'!TARGET_RDNA2_PLUS', but I found we also need to conditionalize the related
'define_expand "reduc_<fexpander>_scal_<mode>"' on '!TARGET_RDNA2_PLUS', to
avoid ICEs like:
[...]/gcc.dg/vect/pr108608.c: In function 'foo':
[...]/gcc.dg/vect/pr108608.c:9:1: error: unrecognizable insn:
(insn 34 33 35 2 (set (reg:V64DF 723)
(unspec:V64DF [
(reg:V64DF 690 [ vect_m_11.20 ])
(const_int 1 [0x1])
] UNSPEC_MOV_DPP_SHR)) -1
(nil))
during RTL pass: vregs
Similar for 'gcc.dg/vect/vect-fmax-2.c', 'gcc.dg/vect/vect-fmin-2.c', and
'UNSPEC_SMAX_DPP_SHR' for 'gcc.dg/vect/vect-fmax-1.c', and
'UNSPEC_SMIN_DPP_SHR' for 'gcc.dg/vect/vect-fmin-1.c', when running 'vect.exp'
for 'check-gcc-c'.
PR target/113615
gcc/
* config/gcn/gcn-valu.md (define_expand "reduc_<fexpander>_scal_<mode>"):
Conditionalize on '!TARGET_RDNA2_PLUS'.
* config/gcn/gcn.cc (gcn_expand_dpp_shr_insn)
(gcn_expand_reduc_scalar):
'gcc_checking_assert (!TARGET_RDNA2_PLUS);'.
|
|
Also, add some safeguards for the future.
Fix-up for commit 52a2c659ae6c21f84b6acce0afcb9b93b9dc71a0
"GCN: Add pre-initial support for gfx1100".
gcc/
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Restore lost
'__gfx90a__' target CPU definition. Add some safeguards for the future.
|
|
When partially substituting a requires-expr, we don't want to perform
any additional checks beyond the substitution itself so as to minimize
checking requirements out of order. So don't check the return-type-req
of a compound-requirement during partial substitution. And don't check
the noexcept condition either since we can't do that on templated trees.
PR c++/113966
gcc/cp/ChangeLog:
* constraint.cc (tsubst_compound_requirement): Don't check
the noexcept condition or the return-type-requirement when
partially substituting.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-friend17.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Replace uses of __seg_gs with the SEG macro defined in the testcase, which picks
the right __seg_{gs,fs} keyword (if any) for the target.
gcc/testsuite/ChangeLog:
* gcc.dg/bitint-86.c (__seg_gs): Replace with SEG MACRO.
|
|
The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.
It does so by pruning the local DEFs with LR_OUT of the block, removing
regs that can never be LR_IN (defined by this block) in the dominance
frontier.
PR rtl-optimization/54052
* rtl-ssa/blocks.cc (function_info::place_phis): Filter
local defs by LR_OUT.
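
The pruning step can be pictured with a small bitset sketch. This assumes a hypothetical 64-register bitset type and helper name; the real code works on GCC's own bitmap structures. Only registers that are both defined in the block and live out of it remain candidates.

```c
#include <stdint.h>

/* Hypothetical register set: bit N set means register N is a member.  */
typedef uint64_t regset;

/* Keep only the locally defined registers that are live out of the
   block; a register that is dead at the end of the block cannot need
   a PHI node because of this block's definition.  */
regset prune_local_defs (regset local_defs, regset lr_out)
{
  return local_defs & lr_out;
}
```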
|
|
definition module
This patch fixes a bug exposed when a constant string is declared in a
definition module and imported by a program module. The bug fix
was to defer the string assignment and concatenation until quadruples
were generated. The conststring symbol has a known field which
must be checked prior to retrieving the string contents.
gcc/m2/ChangeLog:
PR modula2/113889
* gm2-compiler/M2ALU.mod (StringFitsArray): Add tokeno parameter
to GetStringLength.
(InitialiseArrayOfCharWithString): Add tokeno parameter to
GetStringLength.
(CheckGetCharFromString): Add tokeno parameter to GetStringLength.
* gm2-compiler/M2Const.mod (constResolveViaMeta): Replace
PutConstString with PutConstStringKnown.
* gm2-compiler/M2GCCDeclare.mod (DeclareCharConstant): Add tokenno
parameter and add assert. Use tokenno to generate location.
(DeclareStringConstant): Add tokenno and add asserts.
Add tokenno parameter to calls to GetStringLength.
(PromoteToString): Add assert and add tokenno parameter to
GetStringLength.
(PromoteToCString): Add assert and add tokenno parameter to
GetStringLength.
(DeclareConstString): New procedure function.
(TryDeclareConst): Remove size local variable.
Check IsConstStringKnown.
Call DeclareConstString.
(PrintString): New procedure.
(PrintVerboseFromList): Call PrintString.
(CheckResolveSubrange): Check IsConstStringKnown before creating
subrange for char or issuing an error.
* gm2-compiler/M2GenGCC.mod (ResolveConstantExpressions): Add
StringLengthOp, StringConvertM2nulOp, StringConvertCnulOp case
clauses.
(FindSize): Add assert IsConstStringKnown.
(StringToChar): New variable tokenno.
Add tokenno parameter to GetStringLength.
(FoldStringLength): New procedure.
(FoldStringConvertM2nul): New procedure.
(FoldStringConvertCnul): New procedure.
(CodeAddr): Add tokenno parameter.
Replace CurrentQuadToken with tokenno.
Add tokenno parameter to GetStringLength.
(PrepareCopyString): Rewrite.
(IsConstStrKnown): New procedure function.
(FoldAdd): Detect conststring op2 and op3 which are known and
concat. Place result into op1.
(FoldStandardFunction): Pass tokenno as a parameter to
GetStringLength.
(CodeXIndr): Rewrite comment.
Rename op1 to left, op3 to right.
Pass rightpos to GetStringLength.
* gm2-compiler/M2Quads.def (QuadrupleOp): Add
StringConvertCnulOp, StringConvertM2nulOp and StringLengthOp.
* gm2-compiler/M2Quads.mod (import): Remove MakeConstLitString.
Add CopyConstString and PutConstStringKnown.
(IsInitialisingConst): Add StringConvertCnulOp,
StringConvertM2nulOp and StringLengthOp.
(callRequestDependant): Replace MakeConstLitString with
MakeConstString.
(DeferMakeConstStringCnul): New procedure function.
(DeferMakeConstStringM2nul): New procedure function.
(CheckParameter): Add early return if the string const is unknown.
(DescribeType): Add token parameter to GetStringLength.
Check for IsConstStringKnown.
(ManipulateParameters): Use DeferMakeConstStringCnul and
DeferMakeConstStringM2nul.
(MakeLengthConst): Remove and replace with...
(DeferMakeLengthConst): ... this.
(doBuildBinaryOp): Create ConstString and set it to contents
unknown.
Check IsConstStringKnown before generating error message.
(WriteQuad): Add StringConvertCnulOp, StringConvertM2nulOp and
StringLengthOp.
(WriteOperator): Add StringConvertCnulOp, StringConvertM2nulOp and
StringLengthOp.
* gm2-compiler/M2SymInit.mod (CheckReadBeforeInitQuad): Add
StringConvertCnulOp, StringConvertM2nulOp and StringLengthOp.
* gm2-compiler/NameKey.mod (LengthKey): Allow NulName to return 0.
* gm2-compiler/P2SymBuild.mod (BuildString): Replace
MakeConstLitString with MakeConstString.
(DetermineType): Replace PutConstString with PutConstStringKnown.
* gm2-compiler/SymbolTable.def (MakeConstVar): Tidy up comment.
(MakeConstLitString): Remove.
(MakeConstString): New procedure function.
(MakeConstStringCnul): New procedure function.
(MakeConstStringM2nul): New procedure function.
(PutConstStringKnown): New procedure.
(CopyConstString): New procedure.
(IsConstStringKnown): New procedure function.
(IsConstStringM2): New procedure function.
(IsConstStringC): New procedure function.
(IsConstStringM2nul): New procedure function.
(IsConstStringCnul): New procedure function.
(GetStringLength): Add token parameter.
(PutConstString): Remove.
(GetConstStringM2): Remove.
(GetConstStringC): Remove.
(GetConstStringM2nul): Remove.
(GetConstStringCnul): Remove.
(MakeConstStringC): Remove.
* gm2-compiler/SymbolTable.mod (SymConstString): Remove
M2Variant, NulM2Variant, CVariant, NulCVariant.
Add Known.
(CheckAnonymous): Replace $$ with __anon.
(IsNameAnonymous): Replace $$ with __anon.
(MakeConstVar): Detect whether the name is nul and treat as
a temporary constant.
(MakeConstLitString): Remove.
(BackFillString): Remove.
(InitConstString): Rewrite.
(GetConstStringM2): Remove.
(GetConstStringC): Remove.
(GetConstStringContent): New procedure function.
(GetConstStringM2nul): Remove.
(GetConstStringCnul): Remove.
(MakeConstStringCnul): Rewrite.
(MakeConstStringM2nul): Rewrite.
(MakeConstStringC): Remove.
(MakeConstString): Rewrite.
(PutConstStringKnown): New procedure.
(CopyConstString): New procedure.
(PutConstString): Remove.
(IsConstStringKnown): New procedure function.
(IsConstStringM2): New procedure function.
(IsConstStringC): Rewrite.
(IsConstStringM2nul): Rewrite.
(IsConstStringCnul): Rewrite.
(GetConstStringKind): New procedure function.
(GetString): Check Known.
(GetStringLength): Add token parameter and check Known.
gcc/testsuite/ChangeLog:
PR modula2/113889
* gm2/pim/run/pass/pim-run-pass.exp: Add filter for
constdef.mod.
* gm2/extensions/run/pass/callingc2.mod: New test.
* gm2/extensions/run/pass/callingc3.mod: New test.
* gm2/extensions/run/pass/callingc4.mod: New test.
* gm2/extensions/run/pass/callingc5.mod: New test.
* gm2/extensions/run/pass/callingc6.mod: New test.
* gm2/extensions/run/pass/callingc7.mod: New test.
* gm2/extensions/run/pass/callingc8.mod: New test.
* gm2/extensions/run/pass/fixedarray.mod: New test.
* gm2/extensions/run/pass/fixedarray2.mod: New test.
* gm2/pim/run/pass/constdef.def: New test.
* gm2/pim/run/pass/constdef.mod: New test.
* gm2/pim/run/pass/testimportconst.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Some of these are part of the upstream DMD `gdc.test' testsuite, but
they had been omitted because they get mangled by the lib/gdc-utils.exp
helpers when parsing and staging the tests. Translate them over to the
gdc.dg testsuite instead.
gcc/testsuite/ChangeLog:
* gdc.dg/bom_UTF16BE.d: New test.
* gdc.dg/bom_UTF16LE.d: New test.
* gdc.dg/bom_UTF32BE.d: New test.
* gdc.dg/bom_UTF32LE.d: New test.
* gdc.dg/bom_UTF8.d: New test.
* gdc.dg/bom_characters.d: New test.
* gdc.dg/bom_error_UTF8.d: New test.
* gdc.dg/bom_infer_UTF16BE.d: New test.
* gdc.dg/bom_infer_UTF16LE.d: New test.
* gdc.dg/bom_infer_UTF32BE.d: New test.
* gdc.dg/bom_infer_UTF32LE.d: New test.
* gdc.dg/bom_infer_UTF8.d: New test.
|
|
The following testcase ICEs, because BIT_FIELD_REF's position is not
a multiple of the vector element's bit size and the code uses exact_div
to divide those 2 values.
For BIT_INSERT_EXPR, the tree-cfg.cc verification verifies the position
is a multiple of the inserted bit size when inserting into vectors,
but for BIT_FIELD_REF the position can be arbitrary if within the range.
The following patch fixes that.
2024-02-19 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113967
* match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): Require
in condition that @rpos is a multiple of vector element size.
* gcc.dg/pr113967.c: New test.
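
The added condition amounts to a divisibility check before an exact division. A minimal sketch with hypothetical helper names, not the match.pd code itself:

```c
#include <stdbool.h>

/* Return true if a bit position falls on a vector element boundary.
   ELT_BITS must be nonzero.  */
bool position_is_element_aligned (unsigned long pos_bits,
                                  unsigned long elt_bits)
{
  return pos_bits % elt_bits == 0;
}

/* Element index for a position; only meaningful when the check above
   succeeds, which is what an exact division assumes.  */
unsigned long element_index (unsigned long pos_bits, unsigned long elt_bits)
{
  return pos_bits / elt_bits;
}
```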
|
|
Update in v2: Add dump information.
This patch fixes the following ineffective vsetvl insertion:
void f (int32_t * restrict in, int32_t * restrict out, size_t n, size_t cond, size_t cond2)
{
  for (size_t i = 0; i < n; i++)
    {
      if (i == cond) {
        vint8mf8_t v = *(vint8mf8_t*)(in + i + 100);
        *(vint8mf8_t*)(out + i + 100) = v;
      } else if (i == cond2) {
        vfloat32mf2_t v = *(vfloat32mf2_t*)(in + i + 200);
        *(vfloat32mf2_t*)(out + i + 200) = v;
      } else if (i == (cond2 - 1)) {
        vuint16mf2_t v = *(vuint16mf2_t*)(in + i + 300);
        *(vuint16mf2_t*)(out + i + 300) = v;
      } else {
        vint8mf4_t v = *(vint8mf4_t*)(in + i + 400);
        *(vint8mf4_t*)(out + i + 400) = v;
      }
    }
}
Before this patch:
f:
.LFB0:
.cfi_startproc
beq a2,zero,.L12
addi a7,a0,400
addi a6,a1,400
addi a0,a0,1600
addi a1,a1,1600
li a5,0
addi t6,a4,-1
vsetvli t3,zero,e8,mf8,ta,ma ---> ineffective uplift
.L7:
beq a3,a5,.L15
beq a4,a5,.L16
beq t6,a5,.L17
vsetvli t1,zero,e8,mf4,ta,ma
vle8.v v1,0(a0)
vse8.v v1,0(a1)
vsetvli t3,zero,e8,mf8,ta,ma
.L4:
addi a5,a5,1
addi a7,a7,4
addi a6,a6,4
addi a0,a0,4
addi a1,a1,4
bne a2,a5,.L7
.L12:
ret
.L15:
vle8.v v1,0(a7)
vse8.v v1,0(a6)
j .L4
.L17:
vsetvli t1,zero,e8,mf4,ta,ma
addi t5,a0,-400
addi t4,a1,-400
vle16.v v1,0(t5)
vse16.v v1,0(t4)
vsetvli t3,zero,e8,mf8,ta,ma
j .L4
.L16:
addi t5,a0,-800
addi t4,a1,-800
vle32.v v1,0(t5)
vse32.v v1,0(t4)
j .L4
Here the e8mf8 vsetvl is hoisted to the top. That is ineffective, since e8mf8
comes from a low-probability block, namely if (i == cond).
For this case, we disable such fusion.
After this patch:
f:
beq a2,zero,.L12
addi a7,a0,400
addi a6,a1,400
addi a0,a0,1600
addi a1,a1,1600
li a5,0
addi t6,a4,-1
.L7:
beq a3,a5,.L15
beq a4,a5,.L16
beq t6,a5,.L17
vsetvli t1,zero,e8,mf4,ta,ma
vle8.v v1,0(a0)
vse8.v v1,0(a1)
.L4:
addi a5,a5,1
addi a7,a7,4
addi a6,a6,4
addi a0,a0,4
addi a1,a1,4
bne a2,a5,.L7
.L12:
ret
.L15:
vsetvli t3,zero,e8,mf8,ta,ma
vle8.v v1,0(a7)
vse8.v v1,0(a6)
j .L4
.L17:
addi t5,a0,-400
addi t4,a1,-400
vsetvli t1,zero,e8,mf4,ta,ma
vle16.v v1,0(t5)
vse16.v v1,0(t4)
j .L4
.L16:
addi t5,a0,-800
addi t4,a1,-800
vsetvli t3,zero,e32,mf2,ta,ma
vle32.v v1,0(t5)
vse32.v v1,0(t4)
j .L4
Tested on both RV32 and RV64 with no regressions. OK for trunk?
PR target/113696
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info):
Suppress vsetvl fusion.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr113696.c: New test.
|
|
|
|
Since push2/pop2 requires 16-byte stack alignment, don't generate them
if the incoming stack isn't 16-byte aligned.
gcc/
PR target/113912
* config/i386/i386.cc (ix86_can_use_push2pop2): New.
(ix86_pro_and_epilogue_can_use_push2pop2): Use it.
(ix86_emit_save_regs): Don't generate push2 if
ix86_can_use_push2pop2 returns false.
(ix86_expand_epilogue): Don't generate pop2 if
ix86_can_use_push2pop2 returns false.
gcc/testsuite/
PR target/113912
* gcc.target/i386/apx-push2pop2-2.c: New test.
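
The alignment condition described above can be sketched as a simple predicate. The function name and its byte-based parameter are assumptions made for illustration; this is not the actual ix86_can_use_push2pop2 implementation.

```c
#include <stdbool.h>

/* Hypothetical predicate: paired push/pop is only considered when the
   incoming stack alignment, in bytes, is a nonzero multiple of 16.  */
bool can_use_paired_push_pop (unsigned incoming_stack_align_bytes)
{
  return incoming_stack_align_bytes != 0
         && incoming_stack_align_bytes % 16 == 0;
}
```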
|
|
gcc/
* doc/invoke.texi (AVR Options) <-mmcu>: Remove "Atmel".
Note on complete device support.
|
|
gcc/
* doc/extend.texi (AVR Function Attributes): Fuse description
of "signal" and "interrupt" attribute. Link pseudo instruction.
|
|
When not optimized for speed, the test for PR112344 takes several
seconds to execute on native x86_64, and 15 minutes on the PRU target
simulator. Thus mark those variants as expensive. The -O2 variant
which originally triggered the PR is not expensive, hence it is
still run by default.
PR middle-end/112344
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr112344.c: Run non-optimized variants only
if expensive tests are allowed.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
gcc/ChangeLog:
* config/loongarch/larchintrin.h (__movgr2fcsr): Remove redundant
symbol type conversions.
(__cacop_d): Likewise.
(__cpucfg): Likewise.
(__asrtle_d): Likewise.
(__asrtgt_d): Likewise.
(__lddir_d): Likewise.
(__ldpte_d): Likewise.
(__crc_w_b_w): Likewise.
(__crc_w_h_w): Likewise.
(__crc_w_w_w): Likewise.
(__crc_w_d_w): Likewise.
(__crcc_w_b_w): Likewise.
(__crcc_w_h_w): Likewise.
(__crcc_w_w_w): Likewise.
(__crcc_w_d_w): Likewise.
(__csrrd_w): Likewise.
(__csrwr_w): Likewise.
(__csrxchg_w): Likewise.
(__csrrd_d): Likewise.
(__csrwr_d): Likewise.
(__csrxchg_d): Likewise.
(__iocsrrd_b): Likewise.
(__iocsrrd_h): Likewise.
(__iocsrrd_w): Likewise.
(__iocsrrd_d): Likewise.
(__iocsrwr_b): Likewise.
(__iocsrwr_h): Likewise.
(__iocsrwr_w): Likewise.
(__iocsrwr_d): Likewise.
(__frecipe_s): Likewise.
(__frecipe_d): Likewise.
(__frsqrte_s): Likewise.
(__frsqrte_d): Likewise.
|
|
gcc/ChangeLog:
* config/loongarch/larchintrin.h (__iocsrrd_h): Modify the
function return value type to unsigned short.
|
|
|