Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
This patch fixes an oversight in the handling of debug instructions
in rtl-ssa. At the moment (and whether this is a good idea or not
remains to be seen), we maintain a linear RPO sequence of definitions
and non-debug uses. If a register is defined more than once, we use
a degenerate phi to reestablish a previous definition where necessary.
However, debug instructions shouldn't of course affect codegen,
so we can't create a new definition just for them. In those situations
we instead hang the debug use off the real definition (meaning that
debug uses do not follow a linear order wrt definitions). Again,
it remains to be seen whether that's a good idea.
The problem in the PR was that we weren't taking this into account
when increasing (or potentially increasing) the live range of an
existing definition. We'd create the phi even if it would only
be used by debug instructions.
The patch goes for the simple but inelegant approach of passing
a bool to say whether the use is a debug use or not. I imagine
this area will need some tweaking based on experience in future.
gcc/
PR rtl-optimization/100303
* rtl-ssa/accesses.cc (function_info::make_use_available): Take a
boolean that indicates whether the use will only be used in
debug instructions. Treat it in the same way that existing
cross-EBB debug references would be handled if so.
(function_info::make_uses_available): Likewise.
* rtl-ssa/functions.h (function_info::make_uses_available): Update
prototype accordingly.
(function_info::make_uses_available): Likewise.
* fwprop.c (try_fwprop_subst): Update call accordingly.
(cherry picked from commit c97351c0cf4872cc0e99e73ed17fb16659fd38b3)
|
|
insn_info tried to save space by storing the number of
definitions in a 16-bit bitfield. The justification was:
// ... FIRST_PSEUDO_REGISTER + 1
// is the maximum number of accesses to hard registers and memory, and
// MAX_RECOG_OPERANDS is the maximum number of pseudos that can be
// defined by an instruction, so the number of definitions should fit
// easily in 16 bits.
But while that reasoning holds (I think) for real instructions,
it doesn't hold for artificial instructions. I don't think there's
any sensible higher limit we can use, so this patch goes for a full
unsigned int.
gcc/
PR rtl-optimization/108086
* rtl-ssa/insns.h (insn_info): Make m_num_defs a full unsigned int.
Adjust size-related commentary accordingly.
(cherry picked from commit cd41085a37b8288dbdfe0f81027ce04b978578f1)
|
|
This was another PR caused by the way that
vect_determine_precisions_from_range handles shifts. We tried to
narrow 32768 >> x to a 16-bit shift based on range information for
the inputs and outputs, with vect_recog_over_widening_pattern
(after PR110828) adjusting the shift amount. But this doesn't
work for the case where x is in [16, 31], since then 32-bit
32768 >> x is a well-defined zero, whereas no well-defined
16-bit 32768 >> y will produce 0.
We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
but since vect_determine_precisions_from_range was never really
supposed to rely on fix-ups, it seems better to fix that instead.
The patch also makes the code more selective about which codes
can be narrowed based on input and output ranges. This showed
that vect_truncatable_operation_p was missing cases for
BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
(equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).
pr113281-1.c is the original testcase. pr113281-[23].c failed
before the patch due to overly optimistic narrowing. pr113281-[45].c
previously passed and are meant to protect against accidental
optimisation regressions.
gcc/
PR target/113281
* tree-vect-patterns.c (vect_recog_over_widening_pattern): Remove
workaround for right shifts.
(vect_truncatable_operation_p): Handle NEGATE_EXPR and BIT_NOT_EXPR.
(vect_determine_precisions_from_range): Be more selective about
which codes can be narrowed based on their input and output ranges.
For shifts, require at least one more bit of precision than the
maximum shift amount.
gcc/testsuite/
PR target/113281
* gcc.dg/vect/pr113281-1.c: New test.
* gcc.dg/vect/pr113281-2.c: Likewise.
* gcc.dg/vect/pr113281-3.c: Likewise.
* gcc.dg/vect/pr113281-4.c: Likewise.
* gcc.dg/vect/pr113281-5.c: Likewise.
(cherry picked from commit 1a8261e047f7a2c2b0afb95716f7615cba718cd1)
|
|
create_intersect_range_checks checks whether two access ranges
a and b are alias-free using something equivalent to:
end_a <= start_b || end_b <= start_a
It has two ways of doing this: a "vanilla" way that calculates
the exact exclusive end pointers, and another way that uses the
last inclusive aligned pointers (and changes the comparisons
accordingly). The comment for the latter is:
/* Calculate the minimum alignment shared by all four pointers,
then arrange for this alignment to be subtracted from the
exclusive maximum values to get inclusive maximum values.
This "- min_align" is cumulative with a "+ access_size"
in the calculation of the maximum values. In the best
(and common) case, the two cancel each other out, leaving
us with an inclusive bound based only on seg_len. In the
worst case we're simply adding a smaller number than before.
The problem is that the associated code implicitly assumed that the
access size was a multiple of the pointer alignment, and so the
alignment could be carried over to the exclusive end pointer.
The testcase started failing after g:9fa5b473b5b8e289b6542
because that commit improved the alignment information for
the accesses.
gcc/
PR tree-optimization/115192
* tree-data-ref.c (create_intersect_range_checks): Take the
alignment of the access sizes into account.
gcc/testsuite/
PR tree-optimization/115192
* gcc.dg/vect/pr115192.c: New test.
(cherry picked from commit a0fe4fb1c8d7804515845dd5d2a814b3c7a1ccba)
|
|
|
|
any_divmod instructions are modelled with invalid RTX:
[(set (match_operand:DI 0 "register_operand" "=c")
(sign_extend:DI (match_operator:SI 3 "divmod_operator"
[(match_operand:DI 1 "register_operand" "a")
(match_operand:DI 2 "register_operand" "b")])))
(clobber (reg:DI 23))
(clobber (reg:DI 28))]
where SImode divmod_operator (div,mod,udiv,umod) has DImode operands.
Wrap input operand with truncate:SI to make machine modes consistent.
PR target/115297
gcc/ChangeLog:
* config/alpha/alpha.md (<any_divmod:code>si3): Wrap DImode
operands 3 and 4 with truncate:SI RTX.
(*divmodsi_internal_er): Ditto for operands 1 and 2.
(*divmodsi_internal_er_1): Ditto.
(*divmodsi_internal): Ditto.
* config/alpha/constraints.md ("b"): Correct register
number in the description.
gcc/testsuite/ChangeLog:
* gcc.target/alpha/pr115297.c: New test.
(cherry picked from commit 0ac802064c2a018cf166c37841697e867de65a95)
|
|
|
|
|
|
|
|
|
|
PR Target/84790.
The gp init sequence
li $2,%hi(_gp_disp)
addiu $3,$pc,%lo(_gp_disp)
sll $2,16
addu $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.
So the IRA/IPA passes are not aware that $2/$3 have been clobbered,
so they may be used for cross (local) function call.
Let's mark $2/$3 clobber both:
- Just after the UNSPEC_GP RTL of a function;
- Just after a function call.
Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Origin-Patch-by: Felix Fietkau <nbd@nbd.name>.
gcc
* config/mips/mips.c(mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.
(cherry picked from commit 915440eed21de367cb41857afb5273aff5bcb737)
|
|
|
|
|
|
sanitization [PR115172]
The following testcase is miscompiled, because -fsanitize=bool,enum
creates a MEM_REF without propagating there address space qualifiers,
so what should be normally loaded using say %gs:/%fs: segment prefix
isn't. Together with asan it then causes that load to be sanitized.
2024-05-22 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/115172
* ubsan.c (instrument_bool_enum_load): If rhs is not in generic
address space, use qualified version of utype with the right
address space. Formatting fix.
* gcc.dg/asan/pr115172.c: New test.
(cherry picked from commit d3c506eff54fcbac389a529c2e98da108a410b7f)
|
|
|
|
|
|
|
|
|
|
Here AGGREGATE_TYPE_P includes pointers to member functions, which is not
what we want. Instead we should use class||array, as elsewhere in the
function.
PR c++/113598
gcc/cp/ChangeLog:
* init.c (build_vec_init): Don't use {} for PMF.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-pmf2.C: New test.
(cherry picked from commit 136a828754ff65079a834555582b49d54bd5bc64)
|
|
We were failing to handle ANNOTATE_EXPR in tsubst_copy_and_build, leading to
problems with substitution of any wrapped expressions.
Let's also not tell users that lambda templates are available in C++14.
PR c++/111529
gcc/cp/ChangeLog:
* parser.c (cp_parser_lambda_declarator_opt): Don't suggest
-std=c++14 for lambda templates.
* pt.c (tsubst_expr): Move ANNOTATE_EXPR handling...
(tsubst_copy_and_build): ...here.
gcc/testsuite/ChangeLog:
* g++.dg/ext/unroll-4.C: New test.
(cherry picked from commit 9c62af101e11e1cce573c2b3d2e18b403412dbc8)
|
|
The requirement that a type argument be complete is excessive in the case of
direct reference binding to the same type, which does not rely on any
properties of the type. This is LWG 2939.
PR c++/100667
gcc/cp/ChangeLog:
* semantics.c (same_type_ref_bind_p): New.
(finish_trait_expr): Use it.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_constructible8.C: New test.
(cherry picked from commit 8bb3ef3f6e335e8794590fb712a2661d11d21973)
|
|
We represent a reference binding where the referent type is more qualified
by a ck_ref_bind around a ck_qual. We performed the ck_qual and then tried
to undo it with STRIP_NOPS, but that doesn't work if the conversion is
buried in COMPOUND_EXPR. So instead let's avoid performing that fake
conversion in the first place.
PR c++/114561
PR c++/114562
gcc/cp/ChangeLog:
* call.c (convert_like_internal): Avoid adding qualification
conversion in direct reference binding.
gcc/testsuite/ChangeLog:
* g++.dg/conversion/ref10.C: New test.
* g++.dg/conversion/ref11.C: New test.
(cherry picked from commit 5d7e9a35024f065b25f61747859c7cb7a770c92b)
|
|
|
|
|
|
Add regression test to the existing zero/sign extend tests for CMSE to
verify that r0, r1, r2 and r3 are properly extended, not just r0.
boolCharShortEnumSecureFunc test is done using -O0 to ensure the
instructions are in a predictable order.
gcc/testsuite/ChangeLog:
* gcc.target/arm/cmse/extend-param.c: Add regression test. Add
-fshort-enums.
* gcc.target/arm/cmse/extend-return.c: Add -fshort-enums option.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 9ddad76e98ac8f257f90b3814ed3c6ba78d0f3c7)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
There have been several changes in the ABI of Objective-C which
depend on the OS version targetted. In this case Protocols and
LabelProtocols should be made weak/hidden/extern from macOS 10.7
however there was a mistake in the code causing this to occur
from macOS 10.6. Fixed thus.
gcc/objc/ChangeLog:
* objc-next-runtime-abi-02.c (WEAK_PROTOCOLS_AFTER): New.
(next_runtime_abi_02_protocol_decl): Use WEAK_PROTOCOLS_AFTER
to determine this ABI change.
(build_v2_protocol_list_address_table): Likewise.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 9b5c0be59d0f94df0517820f00b4520b5abddd8c)
|
|
|
|
The test FAILs on i686-linux due to
.../gcc/testsuite/g++.dg/torture/vector-subaccess-1.C:16:6: warning: SSE vector argument without SSE enabled changes the ABI [-Wpsabi]
excess warnings.
This fixes it by adding -Wno-psabi, like commonly done in other tests.
2024-05-09 Jakub Jelinek <jakub@redhat.com>
PR c++/89224
* g++.dg/torture/vector-subaccess-1.C: Add -Wno-psabi as additional
options.
(cherry picked from commit 8fb65ec816ff8f0d529b6d30821abace4328c9a2)
|
|
The issue here is that when backprop tries to go
and strip sign ops, it skips over ABSU_EXPR but
ABSU_EXPR not only does an ABS, it also changes the
type to unsigned.
Since strip_sign_op_1 is only supposed to strip off
sign changing operands and not ones that change types,
removing ABSU_EXPR here is correct. We don't handle
nop conversions so this does cause any missed optimizations either.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/110386
gcc/ChangeLog:
* gimple-ssa-backprop.c (strip_sign_op_1): Remove ABSU_EXPR.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/pr110386-1.c: New test.
* gcc.c-torture/compile/pr110386-2.c: New test.
(cherry picked from commit 2bbac12ea7bd8a3eef5382e1b13f6019df4ec03f)
|
|
The problem here is after r6-7425-ga9fee7cdc3c62d0e51730,
the comparison to see if the transformation could be done was using the
wrong value. Instead of see if the inner was LE (for MIN and GE for MAX)
the outer value, it was comparing the inner to the value used in the comparison
which was wrong.
Committed to GCC 13 branch after bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR tree-optimization/111331
* tree-ssa-phiopt.c (minmax_replacement):
Fix the LE/GE comparison for the
`(a CMP CST1) ? max<a,CST2> : a` optimization.
gcc/testsuite/ChangeLog:
PR tree-optimization/111331
* gcc.c-torture/execute/pr111331-1.c: New test.
* gcc.c-torture/execute/pr111331-2.c: New test.
* gcc.c-torture/execute/pr111331-3.c: New test.
(cherry picked from commit 30e6ee074588bacefd2dfe745b188bb20c81fe5e)
|
|
The problem here is that merge_truthop_with_opposite_arm would
use the type of the result of the comparison rather than the operands
of the comparison to figure out if we are honoring NaNs.
This fixes that oversight and now we get the correct results in this
case.
Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
PR middle-end/95351
gcc/ChangeLog:
* fold-const.c (merge_truthop_with_opposite_arm): Use
the type of the operands of the comparison and not the type
of the comparison.
gcc/testsuite/ChangeLog:
* gcc.dg/float_opposite_arm-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 31ce2e993d09dcad1ce139a2848a28de5931056d)
|
|
types [PR89224]
After r7-987-gf17a223de829cb, the access for the elements of a vector type would lose the qualifiers.
So if we had `constvector[0]`, the type of the element of the array would not have const on it.
This was due to a missing build_qualified_type for the inner type of the vector when building the array type.
We need to add back the call to build_qualified_type and now the access has the correct qualifiers. So the
overloads and even if it is a lvalue or rvalue is correctly done.
Note we correctly now reject the testcase gcc.dg/pr83415.c which was incorrectly accepted after r7-987-gf17a223de829cb.
Built and tested for aarch64-linux-gnu.
PR c++/89224
gcc/c-family/ChangeLog:
* c-common.c (convert_vector_to_array_for_subscript): Call build_qualified_type
for the inner type.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_array_reference): Compare main variants
for the vector/array types instead of the types directly.
gcc/testsuite/ChangeLog:
* g++.dg/torture/vector-subaccess-1.C: New test.
* gcc.dg/pr83415.c: Change warning to error.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
(cherry picked from commit 4421d35167b3083e0f2e4c84c91fded09a30cf22)
|
|
|
|
|
|
|
|
|
|
|