If a frame has no saved registers, it can be allocated in one go.
There is no need to treat the areas below and above the saved
registers as separate.
And if we allocate the frame in one go, it should be allocated
as the initial_adjust rather than the final_adjust. This allows the
frame size to grow to guard_size - guard_used_by_caller before a stack
probe is needed. (A frame with no register saves is necessarily a
leaf frame.)
This is a no-op as things stand, since a leaf function will have
no outgoing arguments, and so all the frame will be above where
the saved registers normally go.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Explicitly
allocate the frame in one go if there are no saved registers.
|
|
When we emit the frame chain, i.e. when we reach Here in this statement
of aarch64_expand_prologue:
if (emit_frame_chain)
{
// Here
...
}
the stack is in one of two states:
- We've allocated up to the frame chain, but no more.
- We've allocated the whole frame, and the frame chain is within easy
reach of the new SP.
The offset of the frame chain from the current SP is available
in aarch64_frame as callee_offset. It is also available as the
chain_offset local variable, where the latter is calculated from other
data. (However, chain_offset is not always equal to callee_offset when
!emit_frame_chain, so chain_offset isn't redundant.)
In c600df9a4060da3c6121ff4d0b93f179eafd69d1 I switched to using
chain_offset for the initialisation of the hard frame pointer:
aarch64_add_offset (Pmode, hard_frame_pointer_rtx,
- stack_pointer_rtx, callee_offset,
+ stack_pointer_rtx, chain_offset,
tmp1_rtx, tmp0_rtx, frame_pointer_needed);
But the later REG_CFA_ADJUST_CFA handling still used callee_offset.
I think the difference is harmless, but it's more logical for the
CFA note to be in sync, and it's more convenient for later patches
if it uses chain_offset.
gcc/
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Use
chain_offset rather than callee_offset.
|
|
aarch64_layout_frame uses a shorthand for referring to
cfun->machine->frame:
aarch64_frame &frame = cfun->machine->frame;
This patch does the same for some other heavy users of the structure.
No functional change intended.
gcc/
* config/aarch64/aarch64.cc (aarch64_save_callee_saves): Use
a local shorthand for cfun->machine->frame.
(aarch64_restore_callee_saves, aarch64_get_separate_components):
(aarch64_process_components): Likewise.
(aarch64_allocate_and_probe_stack_space): Likewise.
(aarch64_expand_prologue, aarch64_expand_epilogue): Likewise.
(aarch64_layout_frame): Use existing shorthand for one more case.
|
|
This patch introduces -Wcase-enum, which enumerates each missing
field in a case statement without an else clause, provided the selector
expression type is an enum.
gcc/ChangeLog:
* doc/gm2.texi (Compiler options): Document new option
-Wcase-enum.
gcc/m2/ChangeLog:
* gm2-compiler/M2CaseList.def (PushCase): Rename parameters
r to rec and v to va. Add expr parameter.
(MissingCaseStatementBounds): New procedure function.
* gm2-compiler/M2CaseList.mod (RangePair): Add expression.
(PushCase): Rename parameters r to rec and v to va. Add
expr parameter.
(RemoveRange): New procedure function.
(SubBitRange): Detect the case when the range in the set matches
lo..hi.
(CheckLowHigh): New procedure.
(ExcludeCaseRanges): Rename parameter c to cd. Rename local
variables q to cl and r to rp.
(High): Remove.
(Low): Remove.
(DoEnumValues): Remove.
(IncludeElement): New procedure.
(IncludeElements): New procedure.
(ErrorRangeEnum): New procedure.
(ErrorRange): Remove.
(ErrorRanges): Remove.
(appendEnum): New procedure.
(appendStr): New procedure.
(EnumerateErrors): New procedure.
(MissingCaseBounds): Re-implement.
(InRangeList): Remove.
(MissingCaseStatementBounds): New procedure function.
(checkTypes): Re-format.
(inRange): Re-format.
(TypeCaseBounds): Re-format.
* gm2-compiler/M2Error.mod (GetAnnounceScope): Add noscope to
case label list.
* gm2-compiler/M2GCCDeclare.mod: Replace ForeachFieldEnumerationDo
with ForeachLocalSymDo.
* gm2-compiler/M2Options.def (SetCaseEnumChecking): New procedure.
(CaseEnumChecking): New variable.
* gm2-compiler/M2Options.mod (SetCaseEnumChecking): New procedure.
(Module initialization): set CaseEnumChecking to FALSE.
* gm2-compiler/M2Quads.def (QuadOperator): Alphabetically ordered.
* gm2-compiler/M2Quads.mod (IsBackReferenceConditional): Add else
clause.
(BuildCaseStart): Pass selector expression to InitCaseBounds.
(CheckUninitializedVariablesAreUsed): Remove.
(IsInlineWithinBlock): Remove.
(AsmStatementsInBlock): Remove.
(CheckVariablesInBlock): Remove commented code.
(BeginVarient): Pass NulSym to InitCaseBounds.
* gm2-compiler/M2Range.mod (FoldCaseBounds): New local variable
errorGenerated. Add call to MissingCaseStatementBounds.
* gm2-compiler/P3Build.bnf (CaseEndStatement): Call ElseCase.
* gm2-compiler/PCSymBuild.mod (InitDesExpr): Add else clause.
(InitFunction): Add else clause.
(InitConvert): Add else clause.
(InitLeaf): Add else clause.
(InitBinary): Add else clause.
(InitUnary): Add else clause.
* gm2-compiler/SymbolTable.def (GetNth): Re-write comment.
(ForeachFieldEnumerationDo): Re-write comment stating alphabetical
traversal.
* gm2-compiler/SymbolTable.mod (GetNth): Re-write comment.
Add case label for EnumerationSym and call GetItemFromList.
(ForeachFieldEnumerationDo): Re-write comment stating alphabetical
traversal.
(SymEnumeration): Add ListOfFields used for declaration order.
(MakeEnumeration): Initialize ListOfFields.
(PutFieldEnumeration): Include Field in ListOfFields.
* gm2-gcc/m2options.h (M2Options_SetCaseEnumChecking): New
function.
* gm2-lang.cc (gm2_langhook_handle_option): Add
OPT_Wcase_enum case and call M2Options_SetCaseEnumChecking.
* lang.opt (Wcase-enum): Add.
gcc/testsuite/ChangeLog:
* gm2/switches/case/fail/missingclause.mod: New test.
* gm2/switches/case/fail/switches-case-fail.exp: New test.
* gm2/switches/case/pass/enumcase.mod: New test.
* gm2/switches/case/pass/enumcase2.mod: New test.
* gm2/switches/case/pass/switches-case-pass.exp: New test.
(cherry picked from commit 89b5866742a17c38cc98edd9e434cff8e3a3c7ea)
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Walk expression tree of the 'allocator' clause of 'omp allocate' to
detect more cases where the allocator expression depends on code between
a variable declaration and its associated '#pragma omp allocate'. It also
contains the fix for the 'allocator((omp_allocator_handle_t)-1)' ICE, also
tested for in previous commit.
The changes of this commit were supposed to be part of
r14-3863-g35f498d8dfc8e579eaba2ff2d2b96769c632fd58
OpenMP (C only): omp allocate - extend parsing support, improve diagnostic
which also contains the associated testcase changes but were left out (oops!).
gcc/c/ChangeLog:
* c-parser.cc (struct c_omp_loc_tree): New.
(c_check_omp_allocate_allocator_r): New; checking moved from ...
(c_parser_omp_allocate): ... here. Call it via walk_tree. Avoid
ICE with tree_to_shwi for invalid too-large value.
(cherry picked from commit 27144cc05c4e12f58998b6e30d23098664dd51db)
|
|
The 'allocate' directive can be used for both stack and static variables.
While the parser in C and C++ was pre-existing, it missed several
diagnostics, which this commit adds - for now only for C.
While the "sorry, unimplemented" for static variables is still issued
during parsing, the sorry for stack variables is now issued in the
middle end, preparing for the actual implementation. (Again: only for C.)
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_construct): Move call to
c_parser_omp_allocate to ...
(c_parser_pragma): ... here.
(c_parser_omp_allocate): Avoid ICE if allocator could not be
parsed; set 'omp allocate' attribute for stack/automatic variables
and only reject static variables; add several additional
restriction checks.
* c-tree.h (c_mark_decl_jump_unsafe_in_current_scope): New prototype.
* c-decl.cc (decl_jump_unsafe): Return true for omp-allocated decls.
(c_mark_decl_jump_unsafe_in_current_scope): New.
(warn_about_goto, c_check_switch_jump_warnings): Add error for
omp-allocated decls.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Check for
insertion after variable cleanup. Convert 'omp allocate'
var-decl attribute to GOMP_alloc/GOMP_free calls.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-5.c: Fix testcase; make some
dg-messages for 'sorry' as c++, only.
* c-c++-common/gomp/directive-1.c: Make a 'sorry' c++ only.
* c-c++-common/gomp/allocate-9.c: New test.
* c-c++-common/gomp/allocate-11.c: New test.
* c-c++-common/gomp/allocate-12.c: New test.
* c-c++-common/gomp/allocate-14.c: New test.
* c-c++-common/gomp/allocate-15.c: New test.
* c-c++-common/gomp/allocate-16.c: New test.
(cherry picked from commit 35f498d8dfc8e579eaba2ff2d2b96769c632fd58)
|
|
gcc/
PR target/96762
* config/rs6000/rs6000-string.cc (expand_block_move): Call vector
load/store with length only on 64-bit Power10.
gcc/testsuite/
PR target/96762
* gcc.target/powerpc/pr96762.c: New.
(cherry picked from commit 946b8967b905257ac9f140225db744c9a6ab91be)
|
|
PR target/111340
gcc/ChangeLog:
* config/i386/i386.cc (output_pic_addr_const): Handle CONST_WIDE_INT.
Call output_addr_const for CASE_CONST_SCALAR_INT.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111340.c: New test.
|
|
cc1gm2 issues a runtime case statement error and terminates
when building SeqFile.lo on Fedora mock. There are four
missing labels from the largest case statement in M2SymInit.mod.
This patch adds the case labels and appropriate actions.
gcc/m2/ChangeLog:
PR modula2/111330
* gm2-compiler/M2SymInit.mod (CheckReadBeforeInitQuad): Add
case labels LogicalDiffOp, DummyOp, OptParamOp and
InitAddressOp.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
not commutative.
gcc/ChangeLog:
PR target/111306
PR target/111335
* config/i386/sse.md (int_comm): New int_attr.
(fma_<complexopname>_<mode><sdc_maskz_name><round_name>):
Remove % for Complex conjugate operations since they're not
commutative.
(fma_<complexpairopname>_<mode>_pair): Ditto.
(<avx512>_<complexopname>_<mode>_mask<round_name>): Ditto.
(cmul<conj_op><mode>3): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111306.c: New test.
(cherry picked from commit f197392a16ffb1327f1d12ff8ff05f9295e015cb)
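Why the conjugate forms cannot be marked commutative (the % constraint modifier tells the compiler that an operand and the one after it may be swapped) can be seen with ordinary complex arithmetic. The following is a minimal sketch, not the sse.md patterns themselves: mul and mul_conj are hypothetical helpers, and the operand convention a * conj(b) is assumed purely for illustration.

```cpp
#include <complex>

using cplx = std::complex<double>;

// Plain complex multiplication: swapping the operands gives the same result.
cplx mul(cplx a, cplx b) { return a * b; }

// Conjugate multiplication, modelled here as a * conj(b) (assumed operand
// convention).  Swapping the operands yields the conjugate of the result,
// so the operation is not commutative.
cplx mul_conj(cplx a, cplx b) { return a * std::conj(b); }
```

With a = 1+2i and b = 3+4i, mul_conj(a, b) is 11+2i while mul_conj(b, a) is 11-2i, so letting the compiler swap the operands would change the imaginary part's sign.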
|
|
Merge up to r13-7780-g65e2ddf4f33b5bf42021bc8b2e37e1eecd43e152 (8th Sep 2023)
|
|
Which may result in implicit references to $fp when frame_pointer_needed is false,
causing regs_ever_live[$fp] to be true when $fp is not explicitly used,
resulting in $fp being used as the target replacement register in the rnreg pass.
The bug originates from SPEC2017 541.leela_r(-flto).
gcc/ChangeLog:
PR target/110484
* config/loongarch/loongarch.cc (loongarch_emit_stack_tie): Use the
frame_pointer_needed to determine whether to use the $fp register.
Co-authored-by: Guo Jie <guojie@loongson.cn>
(cherry picked from commit 1967f21d000e09d3d3190317af7923b578ce02b1)
|
|
The earlier patch was only an incremental step toward making this sort of
code work, and broke code that had been working. So let's revert it.
This reverts commit r13-4035-gc41bbfcaf9d6ef.
PR c++/109751
gcc/cp/ChangeLog:
* pt.cc (tsubst_friend_function): Don't check constraints.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-friend11.C: Xfail.
* g++.dg/cpp2a/concepts-friend15.C: New test.
|
|
Here our named return value optimization was breaking the required
destructor when the goto takes 'a' out of scope. A simple fix for the
release branches is to disable the optimization in the presence of backward
goto.
We could do better by disabling the optimization only if there is a backward
goto across the variable declaration, but we don't track that, and in GCC 14
we instead make the goto work with NRV.
PR c++/92407
gcc/cp/ChangeLog:
* cp-tree.h (struct language_function): Add backward_goto.
* decl.cc (check_goto): Set it.
* typeck.cc (check_return_expr): Prevent NRV if set.
gcc/testsuite/ChangeLog:
* g++.dg/opt/nrv22.C: New test.
|
|
The following testcase is miscompiled since r279392 aka r10-5451-gef29b12cfbb4979.
The strlen pass has an adjust_last_stmt function, which performs mainly strcat
or strcat-like optimizations (say strcpy (x, "abcd"); strcat (x, p);
or equivalent memcpy (x, "abcd", strlen ("abcd") + 1); char *q = strchr (x, 0);
memcpy (q, p, strlen (p)); etc.), where the first stmt stores a '\0' character
at the end but the next one immediately overwrites it, so the first memcpy can be
adjusted to store one byte fewer. handle_builtin_memcpy called this function
in two spots, the first one guarded like:
if (olddsi != NULL
&& tree_fits_uhwi_p (len)
&& !integer_zerop (len))
adjust_last_stmt (olddsi, stmt, false);
i.e. only for constant non-zero length. The other spot can call it even
for non-constant length but in that case we punt before that if that length
isn't length of some string + 1, so again non-zero.
The r279392 change I assume wanted to add some warning stuff and changed it
like
if (olddsi != NULL
- && tree_fits_uhwi_p (len)
&& !integer_zerop (len))
- adjust_last_stmt (olddsi, stmt, false);
+ {
+ maybe_warn_overflow (stmt, len, rvals, olddsi, false, true);
+ adjust_last_stmt (olddsi, stmt, false);
+ }
While maybe_warn_overflow possibly handles non-constant length fine,
adjust_last_stmt really relies on length to be non-zero, which
!integer_zerop (len) alone doesn't guarantee. While we could for
len being SSA_NAME ask the ranger or tree_expr_nonzero_p, I think
adjust_last_stmt will not benefit from it much, so the following patch
just restores the above condition/previous behavior for the adjust_last_stmt
call only.
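The reliance on a non-zero length can be illustrated outside the compiler. The following is a minimal sketch, not the strlen pass: store_sequence is a hypothetical helper that models the two stores on an 8-byte buffer pre-filled with '#', and its shorten flag mimics the adjustment that drops the trailing '\0' from the first store.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Model of the two-store sequence on an 8-byte buffer pre-filled with '#',
// so that bytes never written stay visible in the result.
// `shorten` mimics the adjustment: the first copy omits the trailing NUL.
// Hypothetical helper for illustration; requires len <= 4.
std::string store_sequence(const char *p, std::size_t len, bool shorten) {
  char x[8];
  std::memset(x, '#', sizeof x);
  std::memcpy(x, "abcd", shorten ? 4 : 5);  // first store, with or without NUL
  std::memcpy(x + 4, p, len);               // second store starts at the NUL
  return std::string(x, sizeof x);          // whole buffer, embedded NULs kept
}
```

When len is non-zero, the second store overwrites the byte at x + 4 and both variants produce the same buffer. When len is zero, the shortened variant leaves that byte unwritten, which is the situation the restored constant-length guard rules out.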
2023-08-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/110914
* tree-ssa-strlen.cc (strlen_pass::handle_builtin_memcpy): Don't call
adjust_last_stmt unless len is known constant.
* gcc.c-torture/execute/pr110914.c: New test.
(cherry picked from commit 398842e7038ea0f34054f0f694014d0ecd656846)
|
|
The following testcase shows that we mishandle bit insertion for
info->bitsize >= 64. The problem is in using unsigned HOST_WIDE_INT
shift + subtraction + build_int_cst to compute mask, the shift invokes
UB at compile time for info->bitsize 64 and larger and e.g. on the testcase
with info->bitsize happens to compute mask of 0x3f rather than
0x3f'ffffffff'ffffffff.
The patch fixes that by using wide_int wi::mask + wide_int_to_tree, so it
handles masks in any precision (up to WIDE_INT_MAX_PRECISION ;) ).
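The shift problem can be reproduced in isolation: shifting a 64-bit value by 64 or more is undefined behaviour, so a mask wider than one host word has to be built some other way. The following is a minimal sketch, not the store-merging code: Mask128, low_mask64 and low_mask128 are hypothetical names, and two 64-bit limbs stand in for a wide_int.

```cpp
#include <cstdint>

// A 128-bit mask stored as two 64-bit limbs (hypothetical stand-in for a
// wide integer; not a GCC type).
struct Mask128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Mask with the low `bits` bits set, for bits in 0..64.
// The shift is only performed when bits < 64, so it is always well defined.
std::uint64_t low_mask64(unsigned bits) {
  return bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Mask with the low `bits` bits set, for bits in 0..128.
Mask128 low_mask128(unsigned bits) {
  if (bits <= 64)
    return {0, low_mask64(bits)};
  return {low_mask64(bits - 64), ~std::uint64_t{0}};
}
```

For a 70-bit mask this yields hi = 0x3f and lo = 0xffffffffffffffff, i.e. the 0x3f'ffffffff'ffffffff value quoted above, whereas a single 64-bit shift cannot represent it.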
2023-08-30 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/111015
* gimple-ssa-store-merging.cc
(imm_store_chain_info::output_merged_store): Use wi::mask and
wide_int_to_tree instead of unsigned HOST_WIDE_INT shift and
build_int_cst to build BIT_AND_EXPR mask.
* gcc.dg/pr111015.c: New test.
(cherry picked from commit 49a3b35c4068091900b657cd36e5cffd41ef0c47)
|
|
During libgcc configure stage for riscv32-none-elf, when
"--enable-checking=yes,rtl" has been activated, the following error
is observed:
during RTL pass: final
conftest.c: In function 'main':
conftest.c:16:1: internal compiler error: RTL check: expected code 'const_int', have 'reg' in riscv_print_operand, at config/riscv/riscv.cc:4462
16 | }
| ^
0x843c4d rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*)
/mnt/nvme/dinux/local-workspace/gcc/gcc/rtl.cc:916
0x8ea823 riscv_print_operand
/mnt/nvme/dinux/local-workspace/gcc/gcc/config/riscv/riscv.cc:4462
0xde84b5 output_operand(rtx_def*, int)
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3632
0xde8ef8 output_asm_insn(char const*, rtx_def**)
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3544
0xded33b output_asm_insn(char const*, rtx_def**)
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:3421
0xded33b final_scan_insn_1
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:2841
0xded6cb final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:2887
0xded8b7 final_1
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:1979
0xdee518 rest_of_handle_final
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:4240
0xdee518 execute
/mnt/nvme/dinux/local-workspace/gcc/gcc/final.cc:4318
Fix by moving the calculation of memmodel to the cases where it is used.
Regression tested for riscv32-none-elf. No changes in gcc.sum and
g++.sum.
PR target/109725
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_print_operand): Calculate
memmodel only when it is valid.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
REG_P(operand[1]) in -O0.
This issue happens because operand[1] of a scalar move can be a register
(REG_P (operand[1])) at -O0, which prevents the VSETVL pass from
inserting the vsetvl instruction correctly, so the compiler crashes.
Consider the following case:
int16_t foo1 (void *base, size_t vl)
{
int16_t maxVal = __riscv_vmv_x_s_i16m1_i16 (__riscv_vle16_v_i16m1 (base, vl));
return maxVal;
}
Before this patch:
bug.c:15:1: internal compiler error: Segmentation fault
15 | }
| ^
0x145d723 crash_signal
../.././riscv-gcc/gcc/toplev.cc:314
0x22929dd const_csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:44
0x2292a21 csr_operand(rtx_def*, machine_mode)
../.././riscv-gcc/gcc/config/riscv/predicates.md:46
0x23dfbb0 recog_356
../.././riscv-gcc/gcc/config/riscv/iterators.md:72
0x23efecd recog(rtx_def*, rtx_insn*, int*)
../.././riscv-gcc/gcc/config/riscv/iterators.md:89
0xdddc15 recog_memoized(rtx_insn*)
../.././riscv-gcc/gcc/recog.h:273
After this patch:
vsetivli zero,0,e16,m1,ta,ma
vmv.x.s a5,v1
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): For vfmv.f.s/vmv.x.s
instruction replace null avl with (const_int 0).
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/scalar_move-10.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-11.c: New test.
|
|
2023-08-27 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/87477
* parse.cc (parse_associate): Replace the existing evaluation
of the target rank with calls to gfc_resolve_ref and
gfc_expression_rank. Identify untyped target function results
with structure constructors by finding the appropriate derived
type.
* resolve.cc (resolve_symbol): Allow associate variables to be
assumed shape.
gcc/testsuite/
PR fortran/87477
* gfortran.dg/associate_54.f90 : Cope with extra error.
PR fortran/102109
* gfortran.dg/pr102109.f90 : New test.
PR fortran/102112
* gfortran.dg/pr102112.f90 : New test.
PR fortran/102190
* gfortran.dg/pr102190.f90 : New test.
PR fortran/102532
* gfortran.dg/pr102532.f90 : New test.
PR fortran/109948
* gfortran.dg/pr109948.f90 : New test.
PR fortran/99326
* gfortran.dg/pr99326.f90 : New test.
|
|
Merge up to r13-7758-gbb791011b39813bc7b6fdd0d9831247ace199615 (25th Aug 2023)
|
|
Correct the parameter order for avx512ne2ps2bf16_maskz expander
gcc/ChangeLog:
PR target/111127
* config/i386/sse.md (avx512f_cvtne2ps2bf16_<mode>_maskz):
Adjust parameter order.
gcc/testsuite/ChangeLog:
PR target/111127
* gcc.target/i386/pr111127.c: New test.
(cherry picked from commit e62fe74e5af913079ba296c74759cd74c0759e8e)
|
|
Before commit r12-5295-g47de0b56ee455e, all gimple_build_cond in
expand_omp_for_* were inserted with
gsi_insert_before (gsi_p, cond_stmt, GSI_SAME_STMT);
except the one dealing with the multiplicative factor that was
gsi_insert_after (gsi, cond_stmt, GSI_CONTINUE_LINKING);
That commit for PR103208 fixed the issue of some missing regimplify of
operands of GIMPLE_CONDs by moving the condition handling to the new function
expand_omp_build_cond. That function has a 'bool after = false'
argument to switch between the two variants; however, all callers
omitted this argument. This commit reinstates the
prior behavior by passing 'true' for the factor != 0 condition, fixing
the included testcase.
PR middle-end/111017
gcc/
* omp-expand.cc (expand_omp_for_init_vars): Pass after=true
to expand_omp_build_cond for 'factor != 0' condition, resulting
in pre-r12-5295-g47de0b56ee455e code for the gimple insert.
libgomp/
* testsuite/libgomp.c-c++-common/non-rect-loop-1.c: New test.
(cherry picked from commit 1dc65003b66e5a97200f454eeddcccfce34416b3)
|
|
We now have test coverage for non-SSA name bits, so the following amends
the SSA_NAME_OCCURS_IN_ABNORMAL_PHI checks.
PR tree-optimization/111070
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check we have
an SSA name before checking SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* gcc.dg/pr111070.c: New testcase.
(cherry picked from commit 966b0a96523fb7adbf498ac71df5e033c70dc546)
|
|
The following guards the bit test merging code in if-combine against
the appearance of SSA names used in abnormal PHIs.
PR tree-optimization/111039
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Check for
SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
* gcc.dg/pr111039.c: New testcase.
(cherry picked from commit 482551a79a3d3f107f6239679ee74655cfe8707e)
|
|
The following fixes a bad choice in representing things to the alias
oracle by LIM which, while correct in pieces, is inconsistent with itself.
When canonicalizing a ref to a bare deref, instead of leaving the base
object and the extracted offset the same and just substituting an
alternate ref, the following replaces the base and the offset as well,
avoiding the confusion that otherwise will arise in
aliasing_matching_component_refs_p.
PR tree-optimization/111019
* tree-ssa-loop-im.cc (gather_mem_refs_stmt): When canonicalizing
also scrap base and offset in case the ref is indirect.
* g++.dg/torture/pr111019.C: New testcase.
(cherry picked from commit 745ec2135aabfbe2c0fb7780309837d17e8986d4)
|
|
Sometimes IVOPTs chooses a weird induction variable which downstream
leads to issues. Most of the time we can fend those off during costing
by rejecting the candidate but it looks like the address description
costing synthesizes is different from what we end up generating so
the following fixes things up at code generation time. Specifically
we avoid the create_mem_ref_raw fallback which uses a literal zero
address base with the actual base in index2. For the case in question
we have the address
type = unsigned long
offset = 0
elements = {
[0] = &e * -3,
[1] = (sizetype) a.9_30 * 232,
[2] = ivtmp.28_44 * 4
}
from which we code generate the problematical
_3 = MEM[(long int *)0B + ivtmp.36_9 + ivtmp.28_44 * 4];
which references the object at address zero. The patch below
recognizes the fallback after the fact and transforms the
TARGET_MEM_REF memory reference into a LEA for which this form
isn't problematic:
_24 = &MEM[(long int *)0B + ivtmp.36_34 + ivtmp.28_44 * 4];
_3 = *_24;
thereby avoiding the correctness issue. Otherwise we'd later conclude the
program terminates at the null pointer dereference and make the
function pure, miscompiling the main function of the testcase.
PR tree-optimization/110702
* tree-ssa-loop-ivopts.cc (rewrite_use_address): When
we created a NULL pointer based access rewrite that to
a LEA.
* gcc.dg/torture/pr110702.c: New testcase.
(cherry picked from commit 13dfb01e5c30c3bd09333ac79d6ff96a617fea67)
|
|
The patterns that were added in r13-4620-g4d9db4bdd458 missed that
(a > b) and (a <= b) are not inverses of each other for floating-point
comparisons (if NaNs are supported). Even though there was a check for
integral types, it was only for the result of the cond rather than for the
type of what is being compared. The fix is to check whether cmp and
icmp are inverses of each other by using the invert_tree_comparison function.
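The NaN behaviour behind this can be checked with a few lines of standalone code. The following is a minimal sketch, not the match.pd pattern: the three helper names are hypothetical, and it assumes default IEEE semantics (no -ffast-math).

```cpp
#include <cmath>

// For integers, (a <= b) is exactly the inverse of (a > b).
bool int_inverse_holds(int a, int b) { return !(a > b) == (a <= b); }

// For floating point the same identity fails when an operand is NaN:
// both (a > b) and (a <= b) are then false.
bool fp_inverse_holds(double a, double b) { return !(a > b) == (a <= b); }

// The actual inverse of (a > b) is "unordered or (a <= b)".
bool fp_true_inverse_holds(double a, double b) {
  return !(a > b) == (std::isunordered(a, b) || a <= b);
}
```

Calling fp_inverse_holds with a NaN operand returns false, while fp_true_inverse_holds returns true for the same operands.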
OK for trunk and GCC 13 branch? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
I added the testcase to execute/ieee as it requires support for NAN.
PR tree-optimization/111109
gcc/ChangeLog:
* match.pd (ior(cond,cond), ior(vec_cond,vec_cond)):
Add check to make sure cmp and icmp are inverse.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/ieee/fp-cmp-cond-1.c: New test.
(cherry picked from commit 4aa14ec7d5b25722e4d02c29c8c1e22dcc5a4915)
|
|
The following applies some maintenance with respect to type qualifiers
and kinds added by later DWARF standards to prune_unused_types_walk.
The particular case in the bug is not handling (thus marking required)
all restrict qualified type DIEs. I've found more DW_TAG_*_type that
are unhandled, looked up the DWARF docs and added them as well based
on common sense.
PR debug/111080
* dwarf2out.cc (prune_unused_types_walk): Handle
DW_TAG_restrict_type, DW_TAG_shared_type, DW_TAG_atomic_type,
DW_TAG_immutable_type, DW_TAG_coarray_type, DW_TAG_unspecified_type
and DW_TAG_dynamic_type as to only output them when referenced.
* gcc.dg/debug/dwarf2/pr111080.c: New testcase.
(cherry picked from commit bd2c4d6d8fffd5a6dae5217d6076cc4190bab13d)
|
|
Both "graniterapids-d" and "graniterapids" are attached with
PROCESSOR_GRANITERAPIDS in processor_alias_table but mapped to
different __cpu_subtype in get_intel_cpu.
And get_builtin_code_for_version will try to match the first
PROCESSOR_GRANITERAPIDS in processor_alias_table which maps to
"graniterapids" here.
else if (new_target->arch_specified && new_target->arch > 0)
  for (i = 0; i < pta_size; i++)
    if (processor_alias_table[i].processor == new_target->arch)
      {
        const pta *arch_info = &processor_alias_table[i];
        switch (arch_info->priority)
          {
          default:
            arg_str = arch_info->name;
This mismatch makes dispatch_function_versions check the predicate
of __builtin_cpu_is ("graniterapids") for "graniterapids-d" and causes
the issue.
The patch explicitly adds PROCESSOR_GRANITERAPIDS_D to make a distinction.
"alderlake", "raptorlake" and "meteorlake" share the same ISA, cost and
tuning, and are mapped to the same __cpu_type/__cpu_subtype in
get_intel_cpu, so there is no need to add PROCESSOR_RAPTORLAKE and others.
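The first-match behaviour described above can be modelled with a small table. The following is a minimal sketch, not the i386 back end: the Processor enumerators, the Alias struct, both tables and the helper functions are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical processor kinds; not the real enum processor_type.
enum Processor { PROC_GRANITERAPIDS, PROC_GRANITERAPIDS_D };

struct Alias {
  const char *name;
  Processor processor;
};

// Before the fix: both names share one processor kind.
const Alias shared_table[] = {{"graniterapids", PROC_GRANITERAPIDS},
                              {"graniterapids-d", PROC_GRANITERAPIDS}};

// After the fix: each name has its own processor kind.
const Alias split_table[] = {{"graniterapids", PROC_GRANITERAPIDS},
                             {"graniterapids-d", PROC_GRANITERAPIDS_D}};

// Return the name of the FIRST entry whose processor matches.
const char *first_name_for(const Alias *table, std::size_t n, Processor p) {
  for (std::size_t i = 0; i < n; i++)
    if (table[i].processor == p)
      return table[i].name;
  return nullptr;
}

// Map a requested arch name to its processor kind, then back to the name
// that the first-match lookup would use for dispatch.
const char *dispatch_name(const Alias *table, std::size_t n, const char *arch) {
  for (std::size_t i = 0; i < n; i++)
    if (std::strcmp(table[i].name, arch) == 0)
      return first_name_for(table, n, table[i].processor);
  return nullptr;
}
```

With shared_table, a request for "graniterapids-d" resolves to "graniterapids"; with split_table it resolves to "graniterapids-d", which is the distinction the new enumerator provides.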
gcc/ChangeLog:
* common/config/i386/i386-common.cc (processor_names): Add new
member graniterapids-s.
* config/i386/i386-options.cc (processor_alias_table): Update
PROCESSOR_GRANITERAPIDS_D.
(m_GRANITERAPID_D): New macro.
(m_CORE_AVX512): Add m_GRANITERAPIDS_D.
(processor_cost_table): Add icelake_cost for
PROCESSOR_GRANITERAPIDS_D.
* config/i386/i386.h (enum processor_type): Add new member
PROCESSOR_GRANITERAPIDS_D.
* config/i386/i386-c.cc (ix86_target_macros_internal): Handle
PROCESSOR_GRANITERAPIDS_D.
(cherry picked from commit afe15e9742d9fefb3f4a9b1662cb3f977e3645fd)
|
|
gcc/ChangeLog:
* config/i386/i386.cc (ix86_invalid_conversion): Adjust GCC
V13 to GCC 13.1.
(cherry picked from commit 0a888650303750fd72878fc083dfb30b62e30809)
|
|