|
This patch adds a combine pass that runs late in the pipeline.
There are two instances: one between combine and split1, and one
after postreload.
The pass currently has a single objective: remove definitions by
substituting into all uses. The pre-RA version tries to restrict
itself to cases that are likely to have a neutral or beneficial
effect on register pressure.
The patch fixes PR106594. It also fixes a few FAILs and XFAILs
in the aarch64 test results, mostly due to making proper use of
MOVPRFX in cases where we didn't previously.
This is just a first step. I'm hoping that the pass could be
used for other combine-related optimisations in future. In particular,
the post-RA version doesn't need to restrict itself to cases where all
uses are substitutable, since it doesn't have to worry about register
pressure. If we did that, and if we extended it to handle multi-register
REGs, the pass might be a viable replacement for regcprop, which in
turn might reduce the cost of having a post-RA instance of the new pass.
On most targets, the pass is enabled by default at -O2 and above.
However, it has a tendency to undo x86's STV and RPAD passes,
by folding the more complex post-STV/RPAD form back into the
simpler pre-pass form.
Also, running a pass after register allocation means that we can
now match define_insn_and_splits that were previously only matched
before register allocation. This trips things like:
  (define_insn_and_split "..."
    [...pattern...]
    "...cond..."
    "#"
    "&& 1"
    [...pattern...]
    {
      ...unconditional use of gen_reg_rtx ()...;
    }
because matching and splitting after RA will call gen_reg_rtx when
pseudos are no longer allowed. rs6000 has several instances of this.
xtensa has a variation in which the split condition is:
"&& can_create_pseudo_p ()"
The failure then is that, if we match after RA, we'll never be
able to split the instruction.
The patch therefore disables the pass by default on i386, rs6000
and xtensa. Hopefully we can fix those ports later (if their
maintainers want). It seems better to add the pass first, though,
to make it easier to test any such fixes.
gcc.target/aarch64/bitfield-bitint-abi-align{16,8}.c would need
quite a few updates for the late-combine output. That might be
worth doing, but it seems too complex to do as part of this patch.
I tried compiling at least one target per CPU directory and comparing
the assembly output for parts of the GCC testsuite. This is just a way
of getting a flavour of how the pass performs; it obviously isn't a
meaningful benchmark. All targets seemed to improve on average:
Target                 Tests  Good  Bad   %Good    Delta  Median
=====================  =====  ====  ===  ======  =======  ======
aarch64-linux-gnu       2215  1975  240  89.16%    -4159      -1
aarch64_be-linux-gnu    1569  1483   86  94.52%   -10117      -1
alpha-linux-gnu         1454  1370   84  94.22%    -9502      -1
amdgcn-amdhsa           5122  4671  451  91.19%   -35737      -1
arc-elf                 2166  1932  234  89.20%   -37742      -1
arm-linux-gnueabi       1953  1661  292  85.05%   -12415      -1
arm-linux-gnueabihf     1834  1549  285  84.46%   -11137      -1
avr-elf                 4789  4330  459  90.42%  -441276      -4
bfin-elf                2795  2394  401  85.65%   -19252      -1
bpf-elf                 3122  2928  194  93.79%    -8785      -1
c6x-elf                 2227  1929  298  86.62%   -17339      -1
cris-elf                3464  3270  194  94.40%   -23263      -2
csky-elf                2915  2591  324  88.89%   -22146      -1
epiphany-elf            2399  2304   95  96.04%   -28698      -2
fr30-elf                7712  7299  413  94.64%   -99830      -2
frv-linux-gnu           3332  2877  455  86.34%   -25108      -1
ft32-elf                2775  2667  108  96.11%   -25029      -1
h8300-elf               3176  2862  314  90.11%   -29305      -2
hppa64-hp-hpux11.23     4287  4247   40  99.07%   -45963      -2
ia64-linux-gnu          2343  1946  397  83.06%    -9907      -2
iq2000-elf              9684  9637   47  99.51%  -126557      -2
lm32-elf                2681  2608   73  97.28%   -59884      -3
loongarch64-linux-gnu   1303  1218   85  93.48%   -13375      -2
m32r-elf                1626  1517  109  93.30%    -9323      -2
m68k-linux-gnu          3022  2620  402  86.70%   -21531      -1
mcore-elf               2315  2085  230  90.06%   -24160      -1
microblaze-elf          2782  2585  197  92.92%   -16530      -1
mipsel-linux-gnu        1958  1827  131  93.31%   -15462      -1
mipsisa64-linux-gnu     1655  1488  167  89.91%   -16592      -2
mmix                    4914  4814  100  97.96%   -63021      -1
mn10300-elf             3639  3320  319  91.23%   -34752      -2
moxie-rtems             3497  3252  245  92.99%   -87305      -3
msp430-elf              4353  3876  477  89.04%   -23780      -1
nds32le-elf             3042  2780  262  91.39%   -27320      -1
nios2-linux-gnu         1683  1355  328  80.51%    -8065      -1
nvptx-none              2114  1781  333  84.25%   -12589      -2
or1k-elf                3045  2699  346  88.64%   -14328      -2
pdp11                   4515  4146  369  91.83%   -26047      -2
pru-elf                 1585  1245  340  78.55%    -5225      -1
riscv32-elf             2122  2000  122  94.25%  -101162      -2
riscv64-elf             1841  1726  115  93.75%   -49997      -2
rl78-elf                2823  2530  293  89.62%   -40742      -4
rx-elf                  2614  2480  134  94.87%   -18863      -1
s390-linux-gnu          1591  1393  198  87.55%   -16696      -1
s390x-linux-gnu         2015  1879  136  93.25%   -21134      -1
sh-linux-gnu            1870  1507  363  80.59%    -9491      -1
sparc-linux-gnu         1123  1075   48  95.73%   -14503      -1
sparc-wrs-vxworks       1121  1073   48  95.72%   -14578      -1
sparc64-linux-gnu       1096  1021   75  93.16%   -15003      -1
v850-elf                1897  1728  169  91.09%   -11078      -1
vax-netbsdelf           3035  2995   40  98.68%   -27642      -1
visium-elf              1392  1106  286  79.45%    -7984      -2
xstormy16-elf           2577  2071  506  80.36%   -13061      -1
gcc/
PR rtl-optimization/106594
PR rtl-optimization/114515
PR rtl-optimization/114575
PR rtl-optimization/114996
PR rtl-optimization/115104
* Makefile.in (OBJS): Add late-combine.o.
* common.opt (flate-combine-instructions): New option.
* doc/invoke.texi: Document it.
* opts.cc (default_options_table): Enable it by default at -O2
and above.
* tree-pass.h (make_pass_late_combine): Declare.
* late-combine.cc: New file.
* passes.def: Add two instances of late_combine.
* doc/passes.texi: Document the new passes.
* config/i386/i386-options.cc (ix86_override_options_after_change):
Disable late-combine by default.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Likewise.
* config/xtensa/xtensa.cc (xtensa_option_override): Likewise.
gcc/testsuite/
PR rtl-optimization/106594
* gcc.dg/ira-shrinkwrap-prep-1.c: Restrict XFAIL to non-aarch64
targets.
* gcc.dg/ira-shrinkwrap-prep-2.c: Likewise.
* gcc.dg/stack-check-4.c: Add -fno-shrink-wrap.
* gcc.target/aarch64/bitfield-bitint-abi-align16.c: Add
-fno-late-combine-instructions.
* gcc.target/aarch64/bitfield-bitint-abi-align8.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_3.c: Remove XFAILs.
* gcc.target/aarch64/sve/cond_convert_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fabd_5.c: Likewise.
* gcc.target/aarch64/sve/cond_convert_6.c: Expect the MOVPRFX /Zs
described in the comment.
* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
* gcc.target/aarch64/pr106594_1.c: New test.
|
|
rtl-ssa has routines for scanning forwards or backwards for something
under the control of an exclusion set. These searches are currently
used for two main things:
- to work out where an instruction can be moved within its EBB
- to work out whether recog can add a new hard register clobber
The exclusion set was originally a callback function that returned
true for insns that should be ignored. However, for the late-combine
work, I'd also like to be able to skip an entire definition, along
with all its uses.
This patch prepares for that by turning the exclusion set into an
object that provides predicate member functions. Currently the
only two member functions are:
- should_ignore_insn: what the old callback did
- should_ignore_def: the new functionality
but more could be added later.
Doing this also makes it easy to remove some asymmetry that I think
in hindsight was a mistake: in forward scans, ignoring an insn meant
ignoring all definitions in that insn (ok) and all uses of those
definitions (non-obvious). The new interface makes it possible
to select the required behaviour, with that behaviour being applied
consistently in both directions.
Now that the exclusion set is a dedicated object, rather than
just a "random" function, I think it makes sense to remove the
_ignoring suffix from the function names. The suffix was originally
there to describe the callback, and in particular to emphasise that
a true return meant "ignore" rather than "heed".
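The shape of such an exclusion object can be illustrated with a toy C
analogue. This is only a sketch under loud assumptions: rtl-ssa itself
is C++, and every name below (toy_access, toy_ignore, toy_prev_access,
toy_ignore_nothing, toy_never, toy_def_is) is hypothetical and not part
of rtl-ssa. It only shows how one object can carry both an insn
predicate and a definition predicate that a backward scan consults.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an access: which insn it belongs to and which
   definition it refers to.  Not an rtl-ssa type.  */
struct toy_access
{
  int insn_uid;
  int def_id;
};

/* Toy exclusion object: two predicates plus caller data.  A true
   return means "ignore".  */
struct toy_ignore
{
  bool (*should_ignore_insn) (int insn_uid, const void *data);
  bool (*should_ignore_def) (int def_id, const void *data);
  const void *data;
};

/* Predicate that never ignores anything.  */
static bool
toy_never (int id, const void *data)
{
  (void) id;
  (void) data;
  return false;
}

/* Predicate that ignores the definition whose id DATA points to.  */
static bool
toy_def_is (int def_id, const void *data)
{
  return def_id == *(const int *) data;
}

/* Toy counterpart of an "ignore nothing" object.  */
static const struct toy_ignore toy_ignore_nothing
  = { toy_never, toy_never, NULL };

/* Scan backwards from index START (exclusive) and return the index of
   the nearest access that IGNORE does not exclude, or -1 if none.  */
static int
toy_prev_access (const struct toy_access *accesses, int start,
                 const struct toy_ignore *ignore)
{
  for (int i = start - 1; i >= 0; i--)
    if (!ignore->should_ignore_insn (accesses[i].insn_uid, ignore->data)
        && !ignore->should_ignore_def (accesses[i].def_id, ignore->data))
      return i;
  return -1;
}
```

With toy_ignore_nothing the scan stops at the nearest access; with an
object whose should_ignore_def is toy_def_is, every access to the chosen
definition is skipped as a group.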
gcc/
* rtl-ssa.h: Include predicates.h.
* rtl-ssa/predicates.h: New file.
* rtl-ssa/access-utils.h (prev_call_clobbers_ignoring): Rename to...
(prev_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(next_call_clobbers_ignoring): Rename to...
(next_call_clobbers): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(first_nondebug_insn_use_ignoring): Rename to...
(first_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_nondebug_insn_use_ignoring): Rename to...
(last_nondebug_insn_use): ...this and treat the ignore parameter as
an object with the same interface as ignore_nothing.
(last_access_ignoring): Rename to...
(last_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
(prev_access_ignoring): Rename to...
(prev_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing.
(first_def_ignoring): Replace with...
(first_access): ...this new function.
(next_access_ignoring): Rename to...
(next_access): ...this and treat the ignore parameter as an object
with the same interface as ignore_nothing. Conditionally skip
definitions.
* rtl-ssa/change-utils.h (insn_is_changing): Delete.
(restrict_movement_ignoring): Rename to...
(restrict_movement): ...this and treat the ignore parameter as an
object with the same interface as ignore_nothing.
(recog_ignoring): Rename to...
(recog): ...this and treat the ignore parameter as an object with
the same interface as ignore_nothing.
* rtl-ssa/changes.h (insn_is_changing_closure): Delete.
* rtl-ssa/functions.h (function_info::add_regno_clobber): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
* rtl-ssa/insn-utils.h (insn_is): Delete.
* rtl-ssa/insns.h (insn_is_closure): Delete.
* rtl-ssa/member-fns.inl
(insn_is_changing_closure::insn_is_changing_closure): Delete.
(insn_is_changing_closure::operator()): Likewise.
(function_info::add_regno_clobber): Treat the ignore parameter
as an object with the same interface as ignore_nothing.
(ignore_changing_insns::ignore_changing_insns): New function.
(ignore_changing_insns::should_ignore_insn): Likewise.
* rtl-ssa/movement.h (restrict_movement_for_dead_range): Treat
the ignore parameter as an object with the same interface as
ignore_nothing.
(restrict_movement_for_defs_ignoring): Rename to...
(restrict_movement_for_defs): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing.
(restrict_movement_for_uses_ignoring): Rename to...
(restrict_movement_for_uses): ...this and treat the ignore parameter
as an object with the same interface as ignore_nothing. Conditionally
skip definitions.
* doc/rtl.texi: Update for above name changes. Use
ignore_changing_insns instead of insn_is_changing.
* config/aarch64/aarch64-cc-fusion.cc (cc_fusion::parallelize_insns):
Likewise.
* pair-fusion.cc (no_ignore): Delete.
(latest_hazard_before, first_hazard_after): Update for above name
changes. Use ignore_nothing instead of no_ignore.
(pair_fusion_bb_info::fuse_pair): Update for above name changes.
Use ignore_changing_insns instead of insn_is_changing.
(pair_fusion::try_promote_writeback): Likewise.
|
|
The compare_repeat_factors comparator eventually fails qsort checking
because it uses rf2->rank - rf1->rank to compare unsigned numbers,
which gives the wrong sign when the difference is interpreted as a
negative signed value.
Fixed by re-writing the obvious way. I've also fixed the count
comparison, which suffers from truncation as count is 64-bit signed
while the comparator result is a 32-bit int (that's a lot less likely
to hit in practice though).
The testcase from the PR is too large to include.
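As a small stand-alone illustration of the explicit-compare style, here
is a sketch in C. It is not the tree-ssa-reassoc.cc code: the struct
factor type, its field layout and the ordering (ascending count, then
descending rank) are assumptions made for the example. The point is
that each comparison yields -1, 0 or 1 directly, so neither an unsigned
difference nor a truncated 64-bit difference can flip the sign.

```c
#include <stdint.h>

/* Hypothetical record; the field names are for illustration only.  */
struct factor
{
  unsigned rank;
  int64_t count;
};

/* qsort-style comparator: ascending count, then descending rank.
   No subtraction is used, so there is nothing to overflow, wrap
   or truncate when the result is narrowed to int.  */
static int
compare_factors (const void *p1, const void *p2)
{
  const struct factor *f1 = (const struct factor *) p1;
  const struct factor *f2 = (const struct factor *) p2;

  if (f1->count != f2->count)
    return f1->count < f2->count ? -1 : 1;
  if (f1->rank != f2->rank)
    return f1->rank > f2->rank ? -1 : 1;
  return 0;
}
```

With ranks 0 and 3000000000, a subtraction-based comparator would
typically return a negative int for a difference that is really
positive; with counts that differ by exactly 2^32, the truncated
difference would be 0.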
PR tree-optimization/115599
* tree-ssa-reassoc.cc (compare_repeat_factors): Use explicit
compares to avoid truncations.
|
|
gcc/
PR target/113325
* config/rs6000/vsx.md (vsx_stxvd2x4_le_const_<mode>): New.
gcc/testsuite/
PR target/113325
* gcc.target/powerpc/pr113325.c: New.
|
|
gcc/
* fwprop.cc (try_fwprop_subst_pattern): Invoke change_is_worthwhile
to judge if a replacement is worthwhile. Remove single_set check
and add is_debug_insn check.
* recog.cc (swap_change): Invalidate recog_data when the cached INSN
is swapped out.
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Check if the
insn cost of new rtl is unknown and fail the replacement.
|
|
Translates DW_TAG_enumeration_type DIEs into LF_ENUM symbols.
gcc/
* dwarf2codeview.cc (MAX_FIELDLIST_SIZE): Define.
(struct codeview_integer): New structure.
(struct codeview_subtype): Likewise.
(struct codeview_custom_type): Add lf_fieldlist and lf_enum to union.
(write_cv_integer, cv_integer_len): New functions.
(write_lf_fieldlist, write_lf_enum): Likewise.
(write_custom_types): Call write_lf_fieldlist and write_lf_enum.
(add_enum_forward_def): New function.
(get_type_num_enumeration_type): Likewise.
(get_type_num): Handle DW_TAG_enumeration_type DIEs.
* dwarf2codeview.h (LF_FIELDLIST, LF_INDEX, LF_ENUMERATE): Define.
(LF_ENUM, LF_CHAR, LF_SHORT, LF_USHORT, LF_LONG): Likewise.
(LF_ULONG, LF_QUADWORD, LF_UQUADWORD): Likewise.
(CV_ACCESS_PRIVATE, CV_ACCESS_PROTECTED): Likewise.
(CV_ACCESS_PUBLIC, CV_PROP_FWDREF): Likewise.
|
|
gcc/
* dwarf2codeview.cc
(struct codeview_custom_type): Add lf_modifier to union.
(write_cv_padding, write_lf_modifier): New functions.
(write_custom_types): Call write_lf_modifier.
(get_type_num_const_type): New function.
(get_type_num_volatile_type): Likewise.
(get_type_num): Handle DW_TAG_const_type and DW_TAG_volatile_type DIEs.
* dwarf2codeview.h (MOD_const, MOD_volatile): Define.
(LF_MODIFIER): Likewise.
|
|
Translates DW_TAG_pointer_type DIEs into LF_POINTER symbols, which get
output into the .debug$T section.
gcc/
* dwarf2codeview.cc (FIRST_TYPE): Define.
(struct codeview_custom_type): New structure.
(custom_types, last_custom_type): New variables.
(get_type_num): Prototype.
(write_lf_pointer, write_custom_types): New functions.
(codeview_debug_finish): Call write_custom_types.
(add_custom_type, get_type_num_pointer_type): New functions.
(get_type_num): Handle DW_TAG_pointer_type DIEs.
* dwarf2codeview.h (T_VOID): Define.
(CV_POINTER_32, CV_POINTER_64): Likewise.
(T_32PVOID, T_64PVOID): Likewise.
(CV_PTR_NEAR32, CV_PTR64, LF_POINTER): Likewise.
|
|
gcc/
* dwarf2codeview.cc (get_type_num): Handle typedefs.
|
|
Adds a get_type_num function to translate type DIEs into CodeView
numbers, along with a hash table for this. For now we just deal with
the base types (integers, Unicode chars, floats, and bools).
gcc/
* dwarf2codeview.cc (struct codeview_type): New structure.
(struct die_hasher): Likewise.
(types_htab): New variable.
(codeview_debug_finish): Free types_htab if allocated.
(get_type_num_base_type, get_type_num): New functions.
(add_variable): Call get_type_num.
* dwarf2codeview.h (T_CHAR, T_SHORT, T_LONG, T_QUAD): Define.
(T_UCHAR, T_USHORT, T_ULONG, T_UQUAD, T_BOOL08): Likewise.
(T_REAL32, T_REAL64, T_REAL80, T_REAL128, T_RCHAR): Likewise.
(T_WCHAR, T_INT4, T_UINT4, T_CHAR16, T_CHAR32, T_CHAR8): Likewise.
|
|
Parses the DW_TAG_variable DIEs, and outputs S_GDATA32 (for global variables)
and S_LDATA32 (for static global variables) symbols into the .debug$S section.
gcc/
* dwarf2codeview.cc (S_LDATA32, S_GDATA32): Define.
(struct codeview_symbol): New structure.
(sym, last_sym): New variables.
(write_data_symbol): New function.
(write_codeview_symbols): Call write_data_symbol.
(add_variable, codeview_debug_early_finish): New functions.
* dwarf2codeview.h (codeview_debug_early_finish): Prototype.
* dwarf2out.cc
(dwarf2out_early_finish): Call codeview_debug_early_finish.
|
|
Presently, the code fragment:
  int x[5];
  void
  d(int a, int b, int c) {
    for (int i = 0; i < 5; i++)
      x[i] = (a != b) ? c : a;
  }
causes an ICE when compiled with -O2 -march=rv32i_zicond:
test.c: In function 'd':
test.c: error: unrecognizable insn:
11 | }
| ^
(insn 8 5 9 2 (set (reg:SI 139 [ iftmp.0_2 ])
(if_then_else:SI (ne:SI (reg/v:SI 136 [ a ])
(reg/v:SI 137 [ b ]))
(reg/v:SI 136 [ a ])
(reg/v:SI 138 [ c ]))) -1
(nil))
during RTL pass: vregs
This happens because, as part of one of the optimizations in
riscv_expand_conditional_move(), an if_then_else is generated with both
comparands being register operands, resulting in an unmatchable insn since
Zicond patterns require constant 0 as the second comparand. Fix this by adding
an extra check before performing this optimization.
The code snippet mentioned above is also included in this patch as a new Zicond
testcase.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_expand_conditional_move): Add a
CONST0_RTX check.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zicond-ice-5.c: New test.
|
|
extracting INTVAL
Run-of-the-mill checking issue. We had something like (plus (reg) (reg)) and
tried to extract INTVAL (XEXP (x, 1)) which of course blows up with checking
on.
Fixed thusly. Tested on riscv32-elf in my tester. riscv64-elf is in flight,
but won't finish for a while due to other tasks in flight.
PR target/114139
gcc/
* config/riscv/riscv.cc (riscv_macro_fusion_pair_p): Verify object
is a CONST_INT before looking at INTVAL.
gcc/testsuite/
* gcc.target/riscv/pr114139.c: New test.
|
|
The following makes sure to always CSE when there's SLP_TREE_SCALAR_STMTS
as otherwise a chain of two-operator node operations can result in
exponential behavior of the CSE process as likely seen when building
510.parest on aarch64.
PR tree-optimization/115597
* tree-vect-slp.cc (vect_cse_slp_nodes): Allow to CSE
VEC_PERM nodes.
|
|
The recent change to relax store motion for variables that cannot have
store data races broke the optimization to share flag vars for stores
that all happen in the same single BB. The following fixes this.
PR tree-optimization/115579
* tree-ssa-loop-im.cc (execute_sm): Return the auxiliary data
created.
(hoist_memory_references): Record the flag var that's eventually
created and re-use it when all stores are in the same BB.
* gcc.dg/pr115579.c: New testcase.
|
|
Shifting the signed int constant 1 left by 31 is undefined behavior.
Since unsigned int is 32 bits wide, making the constant unsigned before
shifting fixes it and silences the warning.
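A minimal sketch of the difference, assuming a 32-bit unsigned int as
stated above. SIGN_BIT_MASK and flip_sign_bit are hypothetical names
used only for illustration; they are not taken from the intrinsic
headers.

```c
#include <stdint.h>

/* Bit 31 of a 32-bit lane.  Writing 1 << 31 would shift a signed int
   into its sign bit, which is undefined; 1u keeps the whole shift in
   unsigned arithmetic, where the result is well defined.  */
#define SIGN_BIT_MASK (1u << 31)

/* Toggle the top bit of a 32-bit value.  */
static uint32_t
flip_sign_bit (uint32_t lane)
{
  return lane ^ SIGN_BIT_MASK;
}
```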
gcc/ChangeLog:
PR target/115409
* config/i386/avx512fp16intrin.h (_mm512_conj_pch): Make the
constant unsigned before shifting.
* config/i386/avx512fp16vlintrin.h (_mm256_conj_pch): Likewise.
(_mm_conj_pch): Likewise.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
|
|
These tests check the sched2 dump, so skip them for optimization levels
that do not enable sched2.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/mcpu-6.c: Skip for -O0, -O1, -Og.
* gcc.target/riscv/mcpu-7.c: Likewise.
|
|
We can unify eqne and other comparison operations.
Tested on RV32 and RV64.
gcc/ChangeLog:
* config/riscv/predicates.md (comparison_except_eqge_operator): Only
exclude ge.
(comparison_except_ge_operator): Ditto.
* config/riscv/riscv-string.cc (expand_rawmemchr): Use cmp pattern.
(expand_strcmp): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc: Remove eqne cond.
* config/riscv/vector.md (@pred_eqne<mode>_scalar): Remove eqne
patterns.
(*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_scalar): Ditto.
(*pred_eqne<mode>_scalar_narrow): Ditto.
(*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
(*pred_eqne<mode>_extended_scalar): Ditto.
(*pred_eqne<mode>_extended_scalar_narrow): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/integer-cmp-eqne.c: New test.
|
|
> the test should probably also be skipped on -Oz:
>
> === gcc: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andi\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times andn\t 1
> FAIL: gcc.target/riscv/zbs-ext-2.c -Oz scan-assembler-times li\t 1
Yea. Just re-ran things and sure enough we need to skip -Oz as well. So
committing the obvious change....
gcc/testsuite/
* gcc.target/riscv/zbs-ext-2.c: Also skip for -Oz.
|
|
No functional change intended.
gcc/ChangeLog:
* diagnostic-format-json.cc
(json_output_format::on_end_diagnostic): Use
get_diagnostic_kind_text rather than embedding a duplicate copy of
the table.
* diagnostic-format-sarif.cc
(make_rule_id_for_diagnostic_kind): Likewise.
* diagnostic.cc (get_diagnostic_kind_text): New.
* diagnostic.h (get_diagnostic_kind_text): New decl.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
No functional change intended.
gcc/ChangeLog:
* diagnostic-path.cc (diagnostic_event::meaning::dump_to_pp): Move
here from diagnostic.cc.
(diagnostic_event::meaning::maybe_get_verb_str): Likewise.
(diagnostic_event::meaning::maybe_get_noun_str): Likewise.
(diagnostic_event::meaning::maybe_get_property_str): Likewise.
(diagnostic_path::get_first_event_in_a_function): Likewise.
(diagnostic_path::interprocedural_p): Likewise.
(debug): Likewise for diagnostic_path * overload.
* diagnostic.cc (diagnostic_event::meaning::dump_to_pp): Move from
here to diagnostic-path.cc.
(diagnostic_event::meaning::maybe_get_verb_str): Likewise.
(diagnostic_event::meaning::maybe_get_noun_str): Likewise.
(diagnostic_event::meaning::maybe_get_property_str): Likewise.
(diagnostic_path::get_first_event_in_a_function): Likewise.
(diagnostic_path::interprocedural_p): Likewise.
(debug): Likewise for diagnostic_path * overload.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
More minor fallout from the IOR->PLUS change a little while ago. This time on
xstormy16.
The pattern to swap nibbles actually tries to handle all the cases of IOR, XOR
and PLUS. But when we generate PLUS earlier in the pipeline, the
simplifications/canonicalizations are slightly different resulting in the
pattern not matching.
This patch adds an alternate pattern which matches what we get now. Basically
it looks like QImode rotate by 4, zero extended to HI.
Run in my tester to verify the regression was fixed. Pushing to the trunk.
gcc/
* config/stormy16/stormy16.md (swpn_zext): New pattern.
|
|
All uses of xs_hi_nonmemory_operand allow constraint "i",
which means that they allow consts, symbol_refs and label_refs.
The definition of xs_hi_nonmemory_operand accounted for consts,
but not for symbol_refs and label_refs.
gcc/
* config/stormy16/predicates.md (xs_hi_nonmemory_operand): Handle
symbol_ref and label_ref.
|
|
The iq2000 test and branch instructions had patterns like:
  [(set (pc)
        (if_then_else
          (eq (and:SI (match_operand:SI 0 "register_operand" "r")
                      (match_operand:SI 1 "power_of_2_operand" "I"))
              (const_int 0))
          (match_operand 2 "pc_or_label_operand" "")
          (match_operand 3 "pc_or_label_operand" "")))]
power_of_2_operand allows any 32-bit power of 2, whereas "I" only
accepts 16-bit signed constants. This meant that any power of 2
greater than 32768 would cause an "insn does not satisfy its
constraints" ICE.
Also, the %p operand modifier barfed on 1<<31, which is sign-
rather than zero-extended to 64 bits. The code is inherently
limited to 32-bit operands -- power_of_2_operand contains a test
involving "unsigned" -- so this patch just ands with 0xffffffff.
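The effect of the mask can be seen in a small stand-alone sketch. It
assumes that the value arrives as a sign-extended 64-bit integer and
that the caller wants the bit position of a 32-bit power of 2;
log2_of_pow2_32 is a hypothetical helper, not code from iq2000.cc.

```c
#include <stdint.h>

/* Return the bit position of a 32-bit power of 2 that may have been
   sign-extended to 64 bits, or -1 if the low 32 bits are not a power
   of 2.  Without the mask, 1<<31 would arrive as 0xffffffff80000000,
   which has many bits set and so is not a power of 2.  */
static int
log2_of_pow2_32 (int64_t value)
{
  uint64_t bits = (uint64_t) value & 0xffffffffu;

  if (bits == 0 || (bits & (bits - 1)) != 0)
    return -1;

  int log = 0;
  while (bits > 1)
    {
      bits >>= 1;
      log++;
    }
  return log;
}
```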
gcc/
* config/iq2000/iq2000.cc (iq2000_print_operand): Make %p handle 1<<31.
* config/iq2000/iq2000.md: Remove "I" constraints on
power_of_2_operands.
|
|
No-op moves are given the code NOOP_MOVE_INSN_CODE if we plan
to delete them later. Such insns shouldn't be costed, partly
because they're going to disappear, and partly because targets
won't recognise the insn code.
gcc/
* rtl-ssa/changes.cc (rtl_ssa::changes_are_worthwhile): Don't
cost no-op moves.
* rtl-ssa/insns.cc (insn_info::calculate_cost): Likewise.
|
|
* gimple-range.cc (gimple_ranger::register_inferred_ranges): Do not
dump global range info after set_range_info.
(gimple_ranger::register_transitive_inferred_ranges): Likewise.
(dom_ranger::range_of_stmt): Likewise.
* tree-ssanames.cc (set_range_info): If global range info
changes, maybe print new range to dump_file.
* tree-vrp.cc (remove_unreachable::handle_early): Do not
dump global range info after set_range_info.
(remove_unreachable::remove): Likewise.
(remove_unreachable::remove_and_update_globals): Likewise.
(pass_assumptions::execute): Likewise.
|
|
Change the fast VRP algorithm to track contextual ranges active within
each basic block.
* gimple-range.cc (dom_ranger::dom_ranger): Create a block
vector.
(dom_ranger::~dom_ranger): Dispose of the block vector.
(dom_ranger::edge_range): Delete.
(dom_ranger::range_on_edge): Combine range in src BB with any
range gori_name_on_edge returns.
(dom_ranger::range_in_bb): Combine global range with any active
contextual range for an ssa-name.
(dom_ranger::range_of_stmt): Fix non-ssa LHS case, use
fur_depend for folding so relations can be registered.
(dom_ranger::maybe_push_edge): Delete.
(dom_ranger::pre_bb): Create incoming contextual range vector.
(dom_ranger::post_bb): Free contextual range vector.
* gimple-range.h (dom_ranger::edge_range): Delete.
(dom_ranger::m_e0): Delete.
(dom_ranger::m_e1): Delete.
(dom_ranger::m_bb): New.
(dom_ranger::m_pop_list): Delete.
* tree-vrp.cc (execute_fast_vrp): Enable relation oracle.
|
|
Add a remove_unreachable object to fast vrp, and honor the final_p flag.
* tree-vrp.cc (remove_unreachable::remove): Export global range
if builtin_unreachable dominates all uses.
(remove_unreachable::remove_and_update_globals): Do not reset SCEV.
(execute_ranger_vrp): Reset SCEV here instead.
(fvrp_folder::fvrp_folder): Take final pass flag
and create a remove_unreachable object when specified.
(fvrp_folder::pre_fold_stmt): Register GIMPLE_CONDs with
the remove_unreachable object.
(fvrp_folder::m_unreachable): New.
(execute_fast_vrp): Process remove_unreachable object.
(pass_vrp::execute): Add final_p flag to execute_fast_vrp.
|
|
schema [PR109360]
This patch extends the dg directive verify-sarif-file so that if
the "jsonschema" tool is available, it will be used to validate the
generated .sarif file.
Tested with jsonschema 3.2 with Python 3.8
gcc/ChangeLog:
PR testsuite/109360
* doc/install.texi: Mention optional usage of "jsonschema" tool.
gcc/testsuite/ChangeLog:
PR testsuite/109360
* lib/sarif-schema-2.1.0.json: New file, downloaded from
https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json
Licensing information can be seen at
https://github.com/oasis-tcs/sarif-spec/issues/583
which states "They are free to incorporate it into their
implementation. No need for special permission or paperwork from
OASIS."
* lib/scansarif.exp (verify-sarif-file): If "jsonschema" is
available, use it to verify that the .sarif file complies with the
SARIF schema.
* lib/target-supports.exp (check_effective_target_jsonschema):
New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.
Specifically, in
c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.
Root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet). We were
handling this column override for plain text output, but not for .sarif
output.
Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.
We also use column == 0 internally to mean "the line as a whole",
whereas SARIF requires column numbers to be positive.
This patch fixes these various issues.
gcc/ChangeLog:
PR testsuite/109360
* diagnostic-format-sarif.cc
(sarif_builder::make_location_object): Pass any column override
from rich_loc to maybe_make_physical_location_object.
(sarif_builder::maybe_make_physical_location_object): Add
"column_override" param and pass it to maybe_make_region_object.
(sarif_builder::maybe_make_region_object): Add "column_override"
param and use it when the location has 0 for a column. Don't
add "startLine", "startColumn", "endLine", or "endColumn" if
the values aren't positive.
(sarif_builder::maybe_make_region_object_for_context): Don't
add "startLine" or "endLine" if the values aren't positive.
libcpp/ChangeLog:
PR testsuite/109360
* include/rich-location.h (rich_location::get_column_override):
New accessor.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
*minus_plus_one had no constraints, which meant that it could be
matched after RA with operands 0, 1 and 2 all being different.
The associated split instead requires operand 0 to be tied to
operand 1.
gcc/
* config/sh/sh.md (*minus_plus_one): Add constraints.
|
|
It occurs when the body of a protected subprogram is processed, because the
references to the components of the type have not been properly expanded.
gcc/ada/
* gcc-interface/trans.cc (Subprogram_Body_to_gnu): Also return early
for a protected subprogram in -gnatc mode.
|
|
The Address Sanitizer considers that the padding at the end of a justified
modular type may be accessed through the object, but it is never accessed
and therefore can always be reused.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_entity) <discrete_type>: Set
the TYPE_JUSTIFIED_MODULAR_P flag earlier.
* gcc-interface/misc.cc (gnat_unit_size_without_reusable_padding):
New function.
(LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING): Redefine to above
function.
|
|
We set DECL_BIT_FIELD optimistically during the translation of record types
and clear it afterward if needed, but fail to clear other attributes in the
latter case, which fools the logic of the Address Sanitizer.
gcc/ada/
* gcc-interface/utils.cc (clear_decl_bit_field): New function.
(finish_record_type): Call clear_decl_bit_field instead of clearing
DECL_BIT_FIELD manually.
|
|
This adds the missing guard to prevent the reduction from being used when
the target does not provide or cannot synthesize a high-part multiply.
gcc/ada/
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: Fix formatting.
* gcc-interface/utils2.cc: Include optabs-query.h.
(fast_modulo_reduction): Call can_mult_highpart_p on the TYPE_MODE
before generating a high-part multiply. Fix formatting.
|
|
This implements modulo reduction for nonbinary modular multiplication with
small moduli by means of the standard division-free algorithm also used in
the optimizer, but with fewer constraints and therefore better results.
For the sake of consistency, it is also used for the 'Mod attribute of the
same modular types and, more generally, for the Mod (and Rem) operators of
unsigned types if the second operand is static and not a power of two.
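The general idea of a division-free reduction can be sketched in C as
follows. This is only an illustration of the multiply-by-reciprocal
technique (Granlund-Montgomery style), not the gigi implementation:
struct mod16, mod16_init and mod16_reduce are hypothetical names, and
operands are assumed to fit in 16 bits so that a plain 64-bit product
can stand in for a high-part multiply.

```c
#include <stdint.h>

/* Precomputed constants for a fixed modulus D, 1 <= D < 65536.  */
struct mod16
{
  uint32_t d;      /* the modulus */
  uint64_t magic;  /* ceil (2^(16 + shift) / d) */
  unsigned shift;  /* smallest l such that d <= 2^l */
};

/* Compute the constants.  This uses one real division, but only once,
   at the point where the modulus becomes known.  */
static struct mod16
mod16_init (uint32_t d)
{
  struct mod16 r;
  r.d = d;
  r.shift = 0;
  while (((uint32_t) 1 << r.shift) < d)
    r.shift++;
  r.magic = (((uint64_t) 1 << (16 + r.shift)) + d - 1) / d;
  return r;
}

/* Return N mod D for N < 65536 using a multiply, a shift, a second
   multiply and a subtraction, with no division.  */
static uint32_t
mod16_reduce (const struct mod16 *r, uint32_t n)
{
  /* The quotient is the high part of n * magic.  */
  uint32_t q = (uint32_t) ((n * r->magic) >> (16 + r->shift));
  return n - q * r->d;
}
```

For example, with a modulus of 7 the constants are shift = 3 and
magic = 74899, and reducing 100 gives a quotient of 14 and a remainder
of 2.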
gcc/ada/
* gcc-interface/gigi.h (fast_modulo_reduction): Declare.
* gcc-interface/trans.cc (gnat_to_gnu) <N_Op_Mod>: In the unsigned
case, call fast_modulo_reduction for {FLOOR,TRUNC}_MOD_EXPR if the
RHS is a constant and not a power of two, and the precision is not
larger than the word size.
* gcc-interface/utils2.cc: Include expmed.h.
(fast_modulo_reduction): New function.
(nonbinary_modular_operation): Call fast_modulo_reduction for the
multiplication if the precision is not larger than the word size.
|
|
When the interpolated expression is an ambiguous function call,
the frontend does not reject it; it erroneously accepts the call
and generates code that calls one of the candidates.
gcc/ada/
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Reject
ambiguous function calls.
|
|
gcc/ada/
* sem_util.adb (Examine_Array_Bounds): Add missing return
statements. Fix criterion for a string literal being empty.
|
|
This is the minimal fix to avoid the crash.
gcc/ada/
* bcheck.adb (Check_Consistency_Of_Sdep): Guard against path to ALI
file not found.
|
|
When a non-overridable aspect is explicitly specified for a
non-tagged derived type, the compiler blows up while processing
the declaration of an object of such a type.
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Fix code locating the entity
of the parent type.
|
|
Include the invalid path in the error message.
gcc/ada/
* make.adb (Scan_Make_Arg): Adjust error message.
* gnatls.adb (Search_RTS): Likewise.
* switch-b.adb (Scan_Debug_Switches): Likewise.
|
|
The processing of primitive operations is now always uniform for tagged and
untagged types, but the code contains left-overs from the time when it was
specific to tagged types, in particular for the handling of subtypes.
gcc/ada/
* einfo.ads (Direct_Primitive_Operations): Mention concurrent types
as well as GNAT extensions instead of implementation details.
(Primitive_Operations): Document that Direct_Primitive_Operations is
also used for concurrent types as a fallback.
* einfo-utils.adb (Primitive_Operations): Tweak formatting.
* exp_util.ads (Find_Prim_Op): Adjust description.
* exp_util.adb (Make_Subtype_From_Expr): In the private case with
unknown discriminants, always copy Direct_Primitive_Operations and
do not overwrite the Class_Wide_Type of the expression's base type.
* sem_ch3.adb (Analyze_Incomplete_Type_Decl): Tweak comment.
(Analyze_Subtype_Declaration): Remove older and now dead calls to
Set_Direct_Primitive_Operations. Tweak comment.
(Build_Derived_Private_Type): Likewise.
(Build_Derived_Record_Type): Likewise.
(Build_Discriminated_Subtype): Set Direct_Primitive_Operations in
all cases instead of just for tagged types.
(Complete_Private_Subtype): Likewise.
(Derived_Type_Declaration): Tweak comment.
* sem_ch4.ads (Try_Object_Operation): Adjust description.
|
|
The conditional installation resulted in a semantic change. Although
it is likely what is ultimately wanted (since HW interrupts are being
reworked on VxWorks), it must be done in concert with other
modifications for the new formulation of HW interrupts and not in
isolation.
gcc/ada/
* init.c [vxworks] (__gnat_install_handler): Revert to
installing signal handlers without regard to interrupt_state.
|
|
When a package has the declaration of a derived tagged
type T with private null extension that inherits a public
function F with controlling result, and a derivation of T
is declared in the public part of another package, overriding
function F may be rejected by the compiler.
gcc/ada/
* sem_disp.adb (Find_Hidden_Overridden_Primitive): Check
public dispatching primitives of ancestors; previously,
only immediately-visible primitives were checked.
|
|
The Do_Range_Check flag is properly set on the Expression of the EWA node
built for the declare expression, so this instructs Generate_Index_Checks
to look into this Expression.
gcc/ada/
* checks.adb (Generate_Index_Checks): Add specific treatment for
index expressions that are N_Expression_With_Actions nodes.
|
|
This occurs when the bounds of the array component depend on a discriminant
and the component reference is not nested, that is to say the component is
not (referenced as) a subcomponent of a larger record.
In this case, Analyze_Selected_Component does not build the actual subtype
for the component, but it turns out to be required for constructs generated
during the analysis of the case expression.
The change causes this actual subtype to be built, and also renames a local
variable used to hold the prefix of the selected component.
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): Rename Name into Pref
and use Sel local variable consistently.
(Is_Simple_Indexed_Component): New predicate.
Call Is_Simple_Indexed_Component to determine whether to build an
actual subtype for the component.
|
|
The problem is that the handling of the interaction between packing and
aliased/atomic/independent components of an array type is tied to that of
the interaction between a component clause and aliased/atomic/independent
components, although the semantics are different: packing is best-effort,
whereas a component clause must be honored or else an error must be given.
This decouples the two handlings, but retrofits the separate processing of
independent components done in both cases into the common code and changes
the error message from "minimum allowed is" to "minimum allowed value is"
for the sake of consistency with the aliased/atomic processing.
gcc/ada/
* freeze.adb (Freeze_Array_Type): Decouple the handling of the
interaction between packing and aliased/atomic components from
that of the interaction between a component clause and aliased/
atomic components, and retrofit the processing of the interaction
between the two characteristics and independent components into
the common processing.
gcc/testsuite/ChangeLog:
* gnat.dg/atomic10.adb: Adjust.
|