Age | Commit message (Collapse) | Author | Files | Lines |
|
When compiling libgcc or on e.g.
int a[64];
int p;
void
foo (void)
{
int s = 1;
while (p)
{
s -= 11;
a[s] != 0;
}
}
sccvn invokes UB in the compiler as detected by ubsan:
../../gcc/poly-int.h:1089:5: runtime error: left shift of negative value -40
The problem is that we still use C++11..C++17 as the implementation language
and in those C++ versions shifting negative values left is UB (well defined
since C++20) and above in
offset += op->off << LOG2_BITS_PER_UNIT;
op->off is poly_int64 with -40 value (in libgcc with -8).
I understand the offset_int << LOG2_BITS_PER_UNIT shifts but it is then well
defined during underlying implementation which is done on the uhwi limbs,
but for poly_int64 we use
offset += pop->off * BITS_PER_UNIT;
a few lines earlier and I think that is both more readable in what it
actually does and triggers UB only if there would be signed multiply
overflow. In the end, the compiler will treat them the same at least at the
RTL level (at least, if not and they aren't the same cost, it should).
2024-03-07 Jakub Jelinek <jakub@redhat.com>
PR middle-end/105533
* tree-ssa-sccvn.cc (ao_ref_init_from_vn_reference) <case ARRAY_REF>:
Multiple op->off by BITS_PER_UNIT instead of shifting it left by
LOG2_BITS_PER_UNIT.
|
|
In simd_correctness_check.h, the role of the macro ASSERTEQ_64 is to check the
result of the passed vector values for the 64-bit data of each array element.
It turns out that it uses the abs() function to check only the lower 32 bits
of the data at a time, so it replaces abs() with the llabs() function.
However, the following two problems may occur after modification:
1.FAIL in lasx-xvfrint_s.c and lsx-vfrint_s.c
The reason for the error is because vector test cases that use __m{128,256} to
define vector types are composed of 32-bit primitive types, they should use
ASSERTEQ_32 instead of ASSERTEQ_64 to check for correctness.
2.FAIL in lasx-xvshuf_b.c and lsx-vshuf.c
The cause of the error is that the expected result of the function setting in
the test case is incorrect.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vector/lasx/lasx-xvfrint_s.c: Replace
ASSERTEQ_64 with the macro ASSERTEQ_32.
* gcc.target/loongarch/vector/lasx/lasx-xvshuf_b.c: Modify the expected
test results of some functions according to the function of the vector
instruction.
* gcc.target/loongarch/vector/lsx/lsx-vfrint_s.c: Same
modification as lasx-xvfrint_s.c.
* gcc.target/loongarch/vector/lsx/lsx-vshuf.c: Same
modification as lasx-xvshuf_b.c.
* gcc.target/loongarch/vector/simd_correctness_check.h: Use the llabs()
function instead of abs() to check the correctness of the results.
|
|
gcc/ChangeLog:
* config.gcc: Add a case for loongarch*-*-linux-musl*.
* config/loongarch/linux.h: Disable the multilib-compatible
treatment for *musl* targets.
* config/loongarch/musl.h: New file.
|
|
The following patch attempts to fix an optimization regression through
adding a simple simplification. We already have the
/* (m1 CMP m2) * d -> (m1 CMP m2) ? d : 0 */
(if (!canonicalize_math_p ())
(for cmp (tcc_comparison)
(simplify
(mult:c (convert (cmp@0 @1 @2)) @3)
(if (INTEGRAL_TYPE_P (type)
&& INTEGRAL_TYPE_P (TREE_TYPE (@0)))
(cond @0 @3 { build_zero_cst (type); })))
optimization which otherwise triggers during the a * !a multiplication,
but that is done only late and we aren't able through range assumptions
optimize it yet anyway.
The patch adds a specific simplification for it.
If a is zero, then a * !a will be 0 * 1 (or for signed 1-bit 0 * -1)
and so 0.
If a is non-zero, then a * !a will be a * 0 and so again 0.
THe pattern is valid for scalar integers, complex integers and vector types,
but I think will actually trigger only for the scalar integers. For
vector types I've added other two with VEC_COND_EXPR in it, for complex
there are different GENERIC trees to match and it is something that likely
would be never matched in GIMPLE, so I didn't handle that.
2024-03-07 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114009
* genmatch.cc (decision_tree::gen): Emit ARG_UNUSED for captures
argument even for GENERIC, not just for GIMPLE.
* match.pd (a * !a -> 0): New simplifications.
* gcc.dg/tree-ssa/pr114009.c: New test.
|
|
There are two expand_vec_cmp functions.
They have same structure and similar code.
We can use default arguments instead of overloading.
Tested on RV32 and RV64.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (expand_vec_cmp): Change proto
* config/riscv/riscv-v.cc (expand_vec_cmp): Use default arguments
(expand_vec_cmp_float): Adapt arguments
Signed-off-by: demin.han <demin.han@starfivetech.com>
|
|
The previous patch used snprintf to set the message
string. The message string is not a formatted string
and the snprintf will interpret '%' related characters
as format specifiers when there are no associated
output variables. A segfault ensues.
This change replaces snprintf with a fortran string copy
function and null terminates the message string.
PR libfortran/105456
libgfortran/ChangeLog:
* io/list_read.c (list_formatted_read_scalar): Use fstrcpy
from libgfortran/runtime/string.c to replace snprintf.
(nml_read_obj): Likewise.
* io/transfer.c (unformatted_read): Likewise.
(unformatted_write): Likewise.
(formatted_transfer_scalar_read): Likewise.
(formatted_transfer_scalar_write): Likewise.
* io/write.c (list_formatted_write_scalar): Likewise.
(nml_write_obj): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr105456.f90: Revise using '%' characters
in users error message.
|
|
|
|
optimize_function_for_size_p predicate is not stable during optab selection,
because it also depends on node->count/node->frequency of the current function,
which are updated during IPA, so they may change between early opts and
late opts. Use optimize_size instead - optimize_size implies
optimize_function_for_size_p (cfun), so if a named pattern uses
"&& optimize_size" and the insn it splits into uses
optimize_function_for_size_p (cfun), it shouldn't fail.
PR target/114232
gcc/ChangeLog:
* config/i386/mmx.md (negv2qi2): Enable for optimize_size instead
of optimize_function_for_size_p. Explictily enable for TARGET_SSE2.
(negv2qi SSE reg splitter): Enable for TARGET_SSE2 only.
(<plusminus:insn>v2qi3): Enable for optimize_size instead
of optimize_function_for_size_p. Explictily enable for TARGET_SSE2.
(<plusminus:insn>v2qi SSE reg splitter): Enable for TARGET_SSE2 only.
(<any_shift:insn>v2qi3): Enable for optimize_size instead
of optimize_function_for_size_p.
|
|
Three-operand instructions like vmacc are modeled with an implicit
output reload when the output does not match one of the operands. For
this we use vmv.v.v which is subject to length masking.
In a situation where the current vl is less than the full vlenb
and the fma's result value is used as input for a vector reduction
(which is never length masked) we effectively only reduce vl
elements. The masked-out elements are relevant for the
reduction, though, leading to a wrong result.
This patch replaces the vmv reloads by full-register reloads.
gcc/ChangeLog:
PR target/114200
PR target/114202
* config/riscv/vector.md: Use vmv[1248]r.v instead of vmv.v.v.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr114200.c: New test.
* gcc.target/riscv/rvv/autovec/pr114202.c: New test.
|
|
Scalar loads provide offset addressing while unit-stride vector
instructions cannot. The offset must be loaded into a general-purpose
register before it can be used. In order to account for this, this
patch adds an address arithmetic heuristic that keeps track of data
reference operands. If we haven't seen the operand before we add the
cost of a scalar statement.
This helps to get rid of an lbm regression when vectorizing (roughly
0.5% fewer dynamic instructions). gcc5 improves by 0.2% and deepsjeng
by 0.25%. wrf and nab degrade by 0.1%. This is because before we now
adjust the cost of SLP as well as loop-vectorized instructions whereas
we would only adjust loop-vectorized instructions before.
Considering higher scalar_to_vec costs (3 vs 1) for all vectorization
types causes some snippets not to get vectorized anymore. Given these
costs the decision looks correct but appears worse when just counting
dynamic instructions.
In total SPECint 2017 has 4 bln dynamic instructions less and SPECfp 0.7
bln.
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Move...
(costs::adjust_stmt_cost): ... to here and add vec_load/vec_store
offset handling.
(costs::add_stmt_cost): Also adjust cost for statements without
stmt_info.
* config/riscv/riscv-vector-costs.h: Define zero constant.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/costmodel/riscv/rvv/vse-slp-1.c: New test.
* gcc.dg/vect/costmodel/riscv/rvv/vse-slp-2.c: New test.
|
|
By default most patterns can be conditionalized on Arm targets. However
Thumb-2 predication requires the "predicable" attribute be explicitly
set to "yes". Most patterns are shared between Arm and Thumb(-2) and are
marked with "predicable". Given this sharing, it does not make sense to
use a different default for Arm. So only consider conditional execution
of instructions that have the predicable attribute set to yes. This ensures
that patterns not explicitly marked as such are never conditionally executed.
gcc/ChangeLog:
PR target/113915
* config/arm/arm.md (NOCOND): Improve comment.
(arm_rev*) Add predicable.
* config/arm/arm.cc (arm_final_prescan_insn): Add check for
PREDICABLE_YES.
gcc/testsuite/ChangeLog:
PR target/113915
* gcc.target/arm/builtin-bswap-1.c: Fix test to allow conditional
execution both for Arm and Thumb-2.
|
|
This bug totally fell off my radar. Sorry about that.
We have some special casing the conditional move expander to simplify a
conditional move when comparing a register against zero and that same register
is one of the arms.
Specifically a (eq (reg) (const_int 0)) where reg is also the true arm or (ne
(reg) (const_int 0)) where reg is the false arm need not use the fully
generalized conditional move, thus saving an instruction for those cases.
In the NE case we swapped the operands, but didn't swap the condition, which
led to the ICE due to an unrecognized pattern. THe backend actually has
distinct patterns for those two cases. So swapping the operands is neither
needed nor advisable.
Regression tested on rv64gc and verified the new tests pass.
Pushing to the trunk.
PR target/113001
PR target/112871
gcc/
* config/riscv/riscv.cc (expand_conditional_move): Do not swap
operands when the comparison operand is the same as the false
arm for a NE test.
gcc/testsuite
* gcc.target/riscv/zicond-ice-3.c: New test.
* gcc.target/riscv/zicond-ice-4.c: New test.
|
|
When an exception is encountered during simplification of arithmetic
expressions, the result may depend on whether range-checking is active
(-frange-check) or not. However, the code path in the front-end should
stay the same for "soft" errors for which the exception is triggered by the
check, while "hard" errors should always terminate the simplification, so
that error recovery is independent of the flag. Separation of arithmetic
error codes into "hard" and "soft" errors shall be done consistently via
is_hard_arith_error().
PR fortran/103707
PR fortran/106987
gcc/fortran/ChangeLog:
* arith.cc (is_hard_arith_error): New helper function to determine
whether an arithmetic error is "hard" or not.
(check_result): Use it.
(gfc_arith_divide): Set "Division by zero" only for regular
numerators of real and complex divisions.
(reduce_unary): Use is_hard_arith_error to determine whether a hard
or (recoverable) soft error was encountered. Terminate immediately
on hard error, otherwise remember code of first soft error.
(reduce_binary_ac): Likewise.
(reduce_binary_ca): Likewise.
(reduce_binary_aa): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr99350.f90:
* gfortran.dg/arithmetic_overflow_3.f90: New test.
|
|
Here we ICE because we call register_local_specialization while
local_specializations is null, so
local_specializations->put ();
crashes on null this. It's null since maybe_instantiate_noexcept calls
push_to_top_level which creates a new scope. Normally, I would have
guessed that we need a new local_specialization_stack. But here we're
dealing with an operand of a noexcept, which is an unevaluated operand,
and those aren't registered in the hash map. maybe_instantiate_noexcept
wasn't signalling that it's substituting an unevaluated operand though.
PR c++/114114
gcc/cp/ChangeLog:
* pt.cc (maybe_instantiate_noexcept): Save/restore
cp_unevaluated_operand, c_inhibit_evaluation_warnings, and
cp_noexcept_operand around the tsubst_expr call.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept84.C: New test.
|
|
Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move and
use generic code instead.
No functional changes.
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_move) [TARGET_MACHO]:
Eliminate common code and use generic code instead.
|
|
The "SDWA" changes in commit 99890e15527f1f04caef95ecdd135c9f1a077f08
"amdgcn: additional gfx1030/gfx1100 support" caused a few regressions:
PASS: gcc.target/gcn/sram-ecc-3.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-3.c scan-assembler zero_extendv64qiv64si2
PASS: gcc.target/gcn/sram-ecc-4.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-4.c scan-assembler zero_extendv64hiv64si2
PASS: gcc.target/gcn/sram-ecc-7.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-7.c scan-assembler zero_extendv64qiv64si2
PASS: gcc.target/gcn/sram-ecc-8.c (test for excess errors)
[-PASS:-]{+FAIL:+} gcc.target/gcn/sram-ecc-8.c scan-assembler zero_extendv64hiv64si2
Those test cases need corresponding adjustment.
gcc/testsuite/
* gcc.target/gcn/sram-ecc-3.c: Adjust.
* gcc.target/gcn/sram-ecc-4.c: Likewise.
* gcc.target/gcn/sram-ecc-7.c: Likewise.
* gcc.target/gcn/sram-ecc-8.c: Likewise.
|
|
gcc/
* config/avr/avr.cc (avr_rtx_costs_1) [PLUS+ZERO_EXTEND]: Adjust
rtx cost.
|
|
The following reworks vectorizable_live_operation to pass the
live stmt to vect_create_epilog_for_reduction also for early breaks
and a peeled main exit. This is to be able to figure the scalar
definition to replace. This reverts the PR114192 fix as it is
subsumed by this cleanup.
PR tree-optimization/114239
* tree-vect-loop.cc (vect_get_vect_def): Remove.
(vect_create_epilog_for_reduction): The passed in stmt_info
should now be the live stmt that produces the scalar reduction
result. Revert PR114192 fix. Base reduction info off
info_for_reduction. Remove special handling of
early-break/peeled, restore original vector def gathering.
Make sure to pick the correct exit PHIs.
(vectorizable_live_operation): Pass in the proper stmt_info
for early break exits.
* gcc.dg/vect/vect-early-break_122-pr114239.c: New testcase.
|
|
Loops on named vector register are not vectorized (see comment 11 of
PR113622), so the these test cases have been failing for a while.
Rewrite them using check-function-bodies to remove hard coding register
names. A barrier is needed to always load the first operand before the
second operand.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/vfcmp-f.c: Rewrite to avoid named
registers.
* gcc.target/loongarch/vfcmp-d.c: Likewise.
* gcc.target/loongarch/xvfcmp-f.c: Likewise.
* gcc.target/loongarch/xvfcmp-d.c: Likewise.
|
|
While reworking the aarch64 feature descriptions, I forgot
to add out-of-class definitions of some static constants.
This could lead to a build failure with some compilers.
This was seen with some WIP to increase the number of extensions
beyond 64. It's latent on trunk though, and a regression from
before the rework.
gcc/
* config/aarch64/aarch64-feature-deps.h (feature_deps::info): Add
out-of-class definitions of static constants.
|
|
[PR113629]
Unification for conversion operators (DEDUCE_CONV) doesn't perform
transformations like handling forwarding references. This is correct in
general, but not for xobj parameters, which should be handled "normally"
for the purposes of deduction: [temp.deduct.conv] only applies to the
return type of the conversion function.
PR c++/113629
gcc/cp/ChangeLog:
* pt.cc (type_unification_real): Only use DEDUCE_CONV for the
return type of a conversion function.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/explicit-obj-conv-op.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
When we scrap the last def of an odd lane numbered BB reduction
we can end up recording a pattern def which will later wreck
code generation. The following puts this logic where it better
belongs, avoiding this issue.
PR tree-optimization/114249
* tree-vect-slp.cc (vect_build_slp_instance): Move making
a BB reduction lane number even ...
(vect_slp_check_for_roots): ... here to avoid leaking
pattern defs.
* gcc.dg/vect/bb-slp-pr114249.c: New testcase.
|
|
The following makes sure to strip type conversions added by
build_fold_addr_expr before placing the result in a call argument.
PR tree-optimization/114246
* tree-ssa-dse.cc (increment_start_addr): Strip useless
type conversions from the adjusted address.
* gcc.dg/torture/pr114246.c: New testcase.
|
|
When writing the rest_of_handle_insert_vzeroupper workaround to manually
remove all the REG_DEAD/REG_UNUSED notes from the IL, I've missed that
there is a df_analyze () call right after it and that the problems added
earlier in the pass, like df_note_add_problem () done during mode switching,
doesn't affect just the next df_analyze () call right after it, but all
other df_analyze () calls until the end of the current pass where
df_finish_pass removes the optional problems.
So, as can be seen on the following patch, the workaround doesn't actually
work there, because while rest_of_handle_insert_vzeroupper carefully removes
all REG_DEAD/REG_UNUSED notes, the df_analyze () call at the end of the
function immediately adds them in again (so, I must say I have no idea
why the workaround worked on the earlier testcases).
Now, I could move the df_analyze () call just before the REG_DEAD/REG_UNUSED
note removal loop, but I think the following patch is better, because
the df_analyze () call doesn't have to recompute the problem when we don't
care about it and will actively strip all traces of it away.
2024-03-06 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/114190
* config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
Call df_remove_problem for df_note before calling df_analyze.
* gcc.target/i386/avx-pr114190.c: New test.
|
|
The defines IOMSG_LEN and MSGLEN were redundant so these are combined
into IOMSG_LEN as defined in io.h.
The remainder of the patch adds checks for when a user defined
derived type IO procedure sets the IOSTAT or IOMSG variables
independent of the librrary defined I/O messages.
PR libfortran/105456
libgfortran/ChangeLog:
* io/io.h (IOMSG_LEN): Moved to here.
* io/list_read.c (MSGLEN): Removed MSGLEN.
(convert_integer): Changed MSGLEN to IOMSG_LEN.
(parse_repeat): Likewise.
(read_logical): Likewise.
(read_integer): Likewise.
(read_character): Likewise.
(parse_real): Likewise.
(read_complex): Likewise.
(read_real): Likewise.
(check_type): Likewise.
(list_formatted_read_scalar): Adjust to IOMSG_LEN.
(nml_read_obj): Add user defined error message.
* io/transfer.c (unformatted_read): Add user defined error
message.
(unformatted_write): Add user defined error message.
(formatted_transfer_scalar_read): Add user defined error message.
(formatted_transfer_scalar_write): Add user defined error message.
* io/write.c (list_formatted_write_scalar): Add user defined error message.
(nml_write_obj): Add user defined error message.
gcc/testsuite/ChangeLog:
* gfortran.dg/pr105456-nmlr.f90: New test.
* gfortran.dg/pr105456-nmlw.f90: New test.
* gfortran.dg/pr105456-ruf.f90: New test.
* gfortran.dg/pr105456-wf.f90: New test.
* gfortran.dg/pr105456-wuf.f90: New test.
|
|
Here the TEMPLATE_DECL representing the template friend declaration
naming B has class scope since the template B has class scope, but
get_merge_kind assumes all DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P
TEMPLATE_DECL have namespace scope and wrongly returns MK_named instead
of MK_local_friend for the friend.
gcc/cp/ChangeLog:
* module.cc (trees_out::get_merge_kind) <case depset::EK_DECL>:
Accomodate class-scope DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P
TEMPLATE_DECL. Consolidate IDENTIFIER_ANON_P cases.
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-7.h: New test.
* g++.dg/modules/friend-7_a.H: New test.
* g++.dg/modules/friend-7_b.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
|
|
PR debug/114186
DWARF DIEs of type DW_TAG_subrange_type are linked together to represent
the information about the subsequent dimensions. The CTF processing was
so far working through them in the opposite (incorrect) order.
While fixing the issue, refactor the code a bit for readability.
co-authored-By: Indu Bhagat <indu.bhagat@oracle.com>
gcc/
PR debug/114186
* dwarf2ctf.cc (gen_ctf_array_type): Invoke the ctf_add_array ()
in the correct order of the dimensions.
(gen_ctf_subrange_type): Refactor out handling of
DW_TAG_subrange_type DIE to here.
gcc/testsuite/
PR debug/114186
* gcc.dg/debug/ctf/ctf-array-6.c: Add test.
|
|
This patch makes the expansion of IFN_ASAN_MARK let through
poly-int-sized objects. The expansion itself was already generic
enough, but the tests for the fast path were too strict.
gcc/
PR sanitizer/97696
* asan.cc (asan_expand_mark_ifn): Allow the length to be a poly_int.
gcc/testsuite/
PR sanitizer/97696
* gcc.target/aarch64/sve/pr97696.c: New test.
|
|
I was over-eager when adding support for strided SME2 instructions
and accidentally included forms of LUTI2 and LUTI4 that are only
available with SME2.1, not SME2. This patch removes them for now.
We're planning to add proper support for SME2.1 in the GCC 15
timeframe.
Sorry for the blunder :(
gcc/
* config/aarch64/aarch64.md (stride_type): Remove luti_consecutive
and luti_strided.
* config/aarch64/aarch64-sme.md
(@aarch64_sme_lut<LUTI_BITS><mode>): Remove stride_type attribute.
(@aarch64_sme_lut<LUTI_BITS><mode>_strided2): Delete.
(@aarch64_sme_lut<LUTI_BITS><mode>_strided4): Likewise.
* config/aarch64/aarch64-early-ra.cc (is_stride_candidate)
(early_ra::maybe_convert_to_strided_access): Remove support for
strided LUTI2 and LUTI4.
gcc/testsuite/
* gcc.target/aarch64/sme/strided_1.c (test5): Remove.
|
|
For thumb1, when using a peephole to fuse
mov reg, #const
add reg, reg, SP
into
add reg, SP, #const
we must first check that reg is a low register, otherwise we will ICE
when trying to recognize the resulting insn.
gcc/ChangeLog:
PR target/113510
* config/arm/thumb1.md (peephole2 to fuse mov imm/add SP): Use
low_register_operand.
|
|
gcc.target/arm/pr112337.c was failing to validate that adding MVE options
was compatible with the test environment, so add the missing checks.
gcc/testsuite/ChangeLog:
PR target/112337
* gcc.target/arm/pr112337.c: Check for, then use the right MVE
options.
|
|
gcc/rust/ChangeLog:
* ast/rust-ast-collector.cc (TokenCollector::visit):
Remove dead code.
* ast/rust-ast-collector.h: Likewise.
* ast/rust-ast-full-decls.h (class ExternalFunctionItem):
Likewise.
* ast/rust-ast-visitor.cc (DefaultASTVisitor::visit):
Likewise.
* ast/rust-ast-visitor.h: Likewise.
* ast/rust-ast.cc (ExternalFunctionItem::as_string): Likewise.
(ExternalFunctionItem::accept_vis): Likewise.
* checks/errors/rust-ast-validation.cc (ASTValidation::visit):
Likewise.
* checks/errors/rust-ast-validation.h: Likewise.
* checks/errors/rust-feature-gate.h: Likewise.
* expand/rust-cfg-strip.cc (CfgStrip::visit):
Likewise.
* expand/rust-cfg-strip.h: Likewise.
* expand/rust-derive.h: Likewise.
* expand/rust-expand-visitor.cc (ExpandVisitor::visit):
Likewise.
* expand/rust-expand-visitor.h: Likewise.
* hir/rust-ast-lower-base.cc (ASTLoweringBase::visit):
Likewise.
* hir/rust-ast-lower-base.h: Likewise.
* metadata/rust-export-metadata.cc (ExportContext::emit_function):
Likewise.
* parse/rust-parse-impl.h: Likewise.
* parse/rust-parse.h: Likewise.
* resolve/rust-ast-resolve-base.cc (ResolverBase::visit):
Likewise.
* resolve/rust-ast-resolve-base.h: Likewise.
* resolve/rust-default-resolver.cc (DefaultResolver::visit):
Likewise.
* resolve/rust-default-resolver.h: Likewise.
* util/rust-attributes.cc (AttributeChecker::visit): Likewise.
* util/rust-attributes.h: Likewise.
gcc/testsuite/ChangeLog:
* rust/compile/extern_func_with_body.rs: New test.
Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
|
|
gcc/rust/ChangeLog:
* checks/errors/rust-feature-gate.cc (FeatureGate::visit):
Check if function is_external or not.
* hir/rust-ast-lower-extern.h: Use AST::Function
instead of AST::ExternalFunctionItem.
* parse/rust-parse-impl.h (Parser::parse_external_item):
Likewise.
(Parser::parse_pattern): Fix clang format.
* resolve/rust-ast-resolve-implitem.h: Likewise.
* resolve/rust-ast-resolve-item.cc (ResolveExternItem::visit):
Likewise.
* resolve/rust-ast-resolve-item.h: Likewise.
* resolve/rust-default-resolver.cc (DefaultResolver::visit):
Check if param has_pattern before using get_pattern.
Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
|
|
gcc/rust/ChangeLog:
* checks/errors/rust-ast-validation.cc (ASTValidation::visit):
Add external function validation support. Add ErrorCode::E0130.
* parse/rust-parse-impl.h (Parser::parse_function): Parse
external functions from `parse_function`.
(Parser::parse_external_item): Clang format.
(Parser::parse_pattern): Clang format.
* parse/rust-parse.h: Add default parameter
`is_external` in `parse_function`.
Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
|
|
gcc/rust/ChangeLog:
* ast/rust-ast.h: Add Kind Enum to
Pattern.
* ast/rust-macro.h: Add get_pattern_kind().
* ast/rust-path.h: Likewise.
* ast/rust-pattern.h: Likewise.
Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
|
|
gcc/rust/ChangeLog:
* ast/rust-ast.cc (Function::Function): Add `is_external_function` field.
(Function::operator=): Likewise.
* ast/rust-ast.h: New constructor for ExternalItem.
* ast/rust-item.h (class Function): Add `is_external_function`
field. Update `get_node_id`.
* ast/rust-macro.h: Update copy constructor.
Signed-off-by: 0xn4utilus <gyanendrabanjare8@gmail.com>
|
|
Register alloc may expand a 3-operand arithmetic X = Y o CST as
X = CST
X o= Y
where it may be better to instead:
X = Y
X o= CST
because 1) the first insn may use MOVW for "X = Y", and 2) the
operation may be more efficient when performed with a constant,
for example when ADIW or SBIW can be used, or some bytes of
the constant are 0x00 or 0xff.
gcc/
* config/avr/avr.md: Add two RTL peepholes for PLUS, IOR and AND
in HI, PSI, SI that swap operation order from "X = CST, X o= Y"
to "X = Y, X o= CST".
|
|
Fixes: 08edf85f747b ("c++/modules: relax diagnostic about GMF contents")
gcc/c-family/ChangeLog:
* c.opt.urls: Regenerate.
|
|
The psABI allows using s9 as an alias of r22.
gcc/ChangeLog:
* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/regname-fp-s9.c: New test.
|
|
The instructions printed by insn "*insv.any_shift.<mode>_split" were
sub-optimal. The code to print the improved output is lengthy and
performed by new function avr_out_insv. As it turns out, the function
can also handle shift-offsets of zero, which is "*andhi3", "*andpsi3"
and "*andsi3". Thus, these tree insns get a new 3-operand alternative
where the 3rd operand is an exact power of 2.
gcc/
* config/avr/avr-protos.h (avr_out_insv): New proto.
* config/avr/avr.cc (avr_out_insv): New function.
(avr_adjust_insn_length) [ADJUST_LEN_INSV]: Handle case.
(avr_cbranch_cost) [ZERO_EXTRACT]: Adjust rtx costs.
* config/avr/avr.md (define_attr "adjust_len") Add insv.
(andhi3, *andhi3, andpsi3, *andpsi3, andsi3, *andsi3):
Add constraint alternative where the 3rd operand is a power
of 2, and the source register may differ from the destination.
(*insv.any_shift.<mode>_split): Call avr_out_insv to output
instructions. Set attr "length" to "insv".
* config/avr/constraints.md (Cb2, Cb3, Cb4): New constraints.
gcc/testsuite/
* gcc.target/avr/torture/insv-anyshift-hi.c: New test.
* gcc.target/avr/torture/insv-anyshift-si.c: New test.
|
|
The following makes sure to use recognized patterns when vectorizing
roots during BB SLP discovery. We need to apply those late since
during root discovery we've not yet done pattern recognition.
All parts of the vectorizer assume patterns get used, for the testcase
we mix this up when doing live lane computation.
PR tree-optimization/114231
* tree-vect-slp.cc (vect_analyze_slp): Lookup patterns when
processing a BB SLP root.
* gcc.dg/vect/pr114231.c: New testcase.
|
|
gcc/rust/Changelog:
* expand/rust-expand-visitor.cc
(ExpandVisitor::expand_inner_items): Adjust to use has_value ()
(ExpandVisitor::expand_inner_stmts): Likewise
* expand/rust-macro-builtins.cc (builtin_macro_from_string): Likewise
(make_macro_path_str): Likewise
* util/rust-hir-map.cc (Mappings::insert_macro_def): Likewise
* util/rust-lang-item.cc (LangItem::Parse): Adjust to return tl::optional
(LangItem::toString) Likewise
* util/rust-token-converter.cc (handle_suffix): Adjust to use value.or ()
(from_literal) Likewise
* util/bi-map.h (BiMap::lookup): Adjust to use tl::optional for
lookups
Signed-off-by: Sourabh Jaiswal <sourabhrj31@gmail.com>
|
|
On the following testcase, we have
(insn 10 7 11 2 (set (reg/v:TI 106 [ h ])
(rotate:TI (reg/v:TI 106 [ h ])
(const_int 64 [0x40]))) "pr114211.c":8:5 1042 {rotl64ti2_doubleword}
(nil))
before subreg1 and the pass decides to use
(reg:DI 127 [ h ]) / (reg:DI 128 [ h+8 ])
register pair instead of (reg/v:TI 106 [ h ]).
resolve_operand_for_swap_move_operator implements it by pretending it is
an assignment from
(concatn (reg:DI 127 [ h ]) (reg:DI 128 [ h+8 ]))
to
(concatn (reg:DI 128 [ h+8 ]) (reg:DI 127 [ h ]))
The problem is that if the rotate argument is the same as destination or
if there is even an overlap between the first half of the destination with
second half of the source we emit incorrect code, because the store to
(reg:DI 128 [ h+8 ]) overwrites what we need for source of the second
move. The following patch detects that case and uses a temporary pseudo
to hold the original (reg:DI 128 [ h+8 ]) value across the first store.
2024-03-05 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/114211
* lower-subreg.cc (resolve_simple_move): For double-word
rotates by BITS_PER_WORD if there is overlap between source
and destination use a temporary.
* gcc.dg/pr114211.c: New test.
|
|
The following patch adds support for BIT_FIELD_REF lowering with
large/huge _BitInt lhs. BIT_FIELD_REF requires mode argument first
operand, so the operand shouldn't be any huge _BitInt.
If we only access limbs from inside of BIT_FIELD_REF using constant
indexes, we can just create a new BIT_FIELD_REF to extract the limb,
but if we need to use variable index in a loop, I'm afraid we need
to spill it into memory, which is what the following patch does.
If there is some bitwise type for the extraction, it extracts just
what we need and not more than that, otherwise it spills the whole
first argument of BIT_FIELD_REF and uses MEM_REF with an offset
with VIEW_CONVERT_EXPR around it.
2024-03-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/114157
* gimple-lower-bitint.cc: Include stor-layout.h.
(mergeable_op): Return true for BIT_FIELD_REF.
(struct bitint_large_huge): Declare handle_bit_field_ref method.
(bitint_large_huge::handle_bit_field_ref): New method.
(bitint_large_huge::handle_stmt): Use it for BIT_FIELD_REF.
* gcc.dg/bitint-98.c: New test.
* gcc.target/i386/avx2-pr114157.c: New test.
* gcc.target/i386/avx512f-pr114157.c: New test.
|
|
[PR114116]
As mentioned in the PR, on x86_64 currently a lot of ICEs end up
with crashes in the unwinder like:
during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at internal-fn.cc:208
5 | __builtin_clzg (a);
| ^~~~~~~~~~~~~~~~~~
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208
pr114044-2.c:5:3: internal compiler error: Segmentation fault
0x1554262 crash_signal
../../gcc/toplev.cc:319
0x2b20320 x86_64_fallback_frame_state
./md-unwind-support.h:63
0x2b20320 uw_frame_state_for
../../../libgcc/unwind-dw2.c:1013
0x2b2165d _Unwind_Backtrace
../../../libgcc/unwind.inc:303
0x2acbd69 backtrace_full
../../libbacktrace/backtrace.c:127
0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
../../gcc/diagnostic.cc:781
0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
../../gcc/diagnostic.h:1002
0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
../../gcc/diagnostic.cc:1633
0x2a33543 diagnostic_impl
../../gcc/diagnostic.cc:1767
0x2a33c26 internal_error(char const*, ...)
../../gcc/diagnostic.cc:2225
0xe232c8 fancy_abort(char const*, int, char const*)
../../gcc/diagnostic.cc:2336
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208
Segmentation fault (core dumped)
The problem are the PR38534 r14-8470 changes which avoid saving call-saved
registers in noreturn functions. If such functions ever touch the
bp register but because of the r14-8470 changes don't save it in the
prologue, the caller or any other function in the backtrace uses a frame
pointer and the noreturn function or anything it calls directly or
indirectly calls backtrace, then the unwinder crashes, because bp register
contains some unrelated value, but in the frames which do use frame pointer
CFA is based on the bp register.
In theory this could happen with any other call-saved register, e.g. code
written by hand in assembly with .cfi_* directives could use any other
call-saved register as register into which store the CFA or something
related to that, but in reality at least compiler generated code and usual
assembly probably just making sure bp doesn't contain garbage could be
enough for backtrace purposes. In the debugger of course it will not be
enough, the values of the arguments etc. can be lost (if DW_CFA_undefined
is emitted) or garbage.
So, I think for noreturn function we should at least save the bp register
if we use it. If user asks for it using no_callee_saved_registers
attribute, let's honor what is asked for (but then it is up to the user
to make sure e.g. backtrace isn't called from the function or anything it
calls). As discussed in the PR, whether to save bp or not shouldn't be
based on whether compiling with -g or -g0, because we don't want code
generation changes without/with debugging, it would also break
-fcompare-debug, and users can call backtrace(3), that doesn't use debug
info, just unwind info, even backtrace_symbols{,_fd}(3) don't use debug info
but just looks at dynamic symbol table.
The patch also adds check for no_caller_saved_registers
attribute in the implicit addition of not saving callee saved register
in noreturn functions, because on I think
__attribute__((no_caller_saved_registers, noreturn)) will otherwise
error that no_caller_saved_registers and no_callee_saved_registers
attributes are incompatible (but user didn't specify anything like that).
2024-03-05 Jakub Jelinek <jakub@redhat.com>
PR target/114116
* config/i386/i386.h (enum call_saved_registers_type): Add
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP enumerator.
* config/i386/i386-options.cc (ix86_set_func_type): Remove
has_no_callee_saved_registers variable, add no_callee_saved_registers
instead, initialize it depending on whether it is
no_callee_saved_registers function or not. Don't set it if
no_caller_saved_registers attribute is present. Adjust users.
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Handle
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP like
TYPE_NO_CALLEE_SAVED_REGISTERS.
(ix86_save_reg): Handle TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* gcc.target/i386/pr38534-1.c: Allow push/pop of bp.
* gcc.target/i386/pr38534-4.c: Likewise.
* gcc.target/i386/pr38534-2.c: Likewise.
* gcc.target/i386/pr38534-3.c: Likewise.
* gcc.target/i386/pr114097-1.c: Likewise.
* gcc.target/i386/stack-check-17.c: Expect no pop on ! ia32.
|
|
Cleanup mode_size related code which is not used anymore. Below tests are
passed for this patch.
* The RVV fully regresssion test.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused
mode_size related code.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Issuing a hard error when the GMF doesn't consist only of preprocessing
directives happens to be inconvenient for automated testcase reduction
via cvise. This patch relaxes this diagnostic into a pedwarn that can
be disabled with -Wno-global-module.
gcc/c-family/ChangeLog:
* c.opt (Wglobal-module): New warning.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_translation_unit): Relax GMF contents
error into a pedwarn.
gcc/ChangeLog:
* doc/invoke.texi (-Wno-global-module): Document.
gcc/testsuite/ChangeLog:
* g++.dg/modules/friend-6_a.C: Pass -Wno-global-module instead
of -Wno-pedantic. Remove now unnecessary preprocessing
directives from GMF.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
|
|
Currently a using-declaration bringing a name into its own namespace is
a no-op, except for functions. This prevents people from being able to
redeclare a name brought in from the GMF as exported, however, which
this patch fixes.
Apart from marking declarations as exported they are also now marked as
effectively being in the module purview (due to the using-decl) so that
they are properly processed, as 'add_binding_entity' assumes that
declarations not in the module purview cannot possibly be exported.
gcc/cp/ChangeLog:
* name-lookup.cc (walk_module_binding): Remove completed FIXME.
(do_nonmember_using_decl): Mark redeclared entities as exported
when needed. Check for re-exporting internal linkage types.
gcc/testsuite/ChangeLog:
* g++.dg/modules/using-12.C: New test.
* g++.dg/modules/using-13.h: New test.
* g++.dg/modules/using-13_a.C: New test.
* g++.dg/modules/using-13_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|