Age | Commit message (Collapse) | Author | Files | Lines |
|
Add support for some recent AVR devices.
gcc/
* config/avr/avr-mcus.def: Add avr32eb14, avr32eb20,
avr32eb28, avr32eb32.
* doc/avr-mmcu.texi: Rebuild.
|
|
gcc/
PR target/121794
* config/avr/avr.md (cmpqi3): Use cpi R,0 if possible.
|
|
There are some cases where involing zero_reg is not needed and
where there are other sequences with the same efficiency.
An example is to use SBCI R,0 instead of SBC R,__zero_reg__
when R >= R16. This may turn out to be better for small ISRs.
PR target/121794
gcc/
* config/avr/avr.cc (avr_out_compare): Only use zero_reg
when there is no other sequence of the same length.
(avr_out_plus_ext): Same.
(avr_out_plus_1): Same.
|
|
The linker rejects --relax in relocatable links (-r), hence only
add --relax when -r is not specified.
gcc/
PR target/121608
* config/avr/specs.h (LINK_RELAX_SPEC): Wrap in %{!r...}.
|
|
gcc/
* config/avr/avr.cc (avr_rtx_costs_1) [SIGN_EXTEND]: Adjust cost.
* config/avr/avr.md (*sext.ashift<QIPSI:mode><HISI:mode>2): New
insn and a cc split.
|
|
PR target/121359
gcc/
* config/avr/avr.h: Remove -mlra and remains of reload.
* config/avr/avr.cc: Same.
* config/avr/avr.md: Same.
* config/avr/avr-log.cc: Same.
* config/avr/avr-protos.h: Same.
* config/avr/avr.opt: Same.
* config/avr/avr.opt.urls: Same.
gcc/testsuite/
* gcc.target/avr/torture/pr118591-1.c: Remove -mlra.
* gcc.target/avr/torture/pr118591-2.c: Same.
|
|
There are many post-reload define_insn_and_split's that just append
a (clobber (reg:CC REG_CC)) to the pattern. Instead of repeating
the original patterns, avr_add_ccclobber (curr_insn) is used to do
that job.
This avoids repeating patterns all over the place, and splits that do
something different (like using a canonical form) stand out clearly.
gcc/
* config/avr/avr.md (define_insn_and_split) [reload_completed]:
For splits that just append a (clobber (reg:CC REG_CC)) to
the pattern, use avr_add_ccclobber (curr_insn) instead of
repeating the original pattern.
* config/avr/avr-dimode.md: Same.
* config/avr/avr-fixed.md: Same.
|
|
gcc/
* config/avr/avr.cc (avr_add_ccclobber): New function.
* config/avr/avr-protos.h (avr_add_ccclobber): New proto.
(DONE_ADD_CCC): New define.
|
|
PR rtl-optimization 121340
gcc/
* config/avr/avr.opt.urls (-mfuse-move2): Add url.
|
|
gcc/
* config/avr/avr.cc (avr_output_addr_vec) <labl>: Asm out its .type.
|
|
insn combine.
Insn combine may come up with superfluous reg-reg moves, where the combine
people say that these are no problem since reg-alloc is supposed to optimize
them. The issue is that the lower-subreg pass sitting between combine and
reg-alloc may split such moves, coming up with a zoo of subregs which are
only handled poorly by the register allocator.
This patch adds a new avr mini-pass that handles such cases.
As an example, take
int f_ffssi (long x)
{
return __builtin_ffsl (x);
}
where the two functions have the same interface, i.e. there are no extra
moves required for the argument or for the return value. However,
$ avr-gcc -S -Os -dp -mno-fuse-move ...
f_ffssi:
mov r20,r22 ; 29 [c=4 l=1] movqi_insn/0
mov r21,r23 ; 30 [c=4 l=1] movqi_insn/0
mov r22,r24 ; 31 [c=4 l=1] movqi_insn/0
mov r23,r25 ; 32 [c=4 l=1] movqi_insn/0
mov r25,r23 ; 33 [c=4 l=4] *movsi/0
mov r24,r22
mov r23,r21
mov r22,r20
rcall __ffssi2 ; 34 [c=16 l=1] *ffssihi2.libgcc
ret ; 37 [c=0 l=1] return
where all the moves add up to a no-op. The -mno-fuse-move option
stops any attempts by the avr backend to clean up that mess.
PR rtl-optimization/121340
gcc/
* config/avr/avr.opt (-mfuse-move2): New option.
* config/avr/avr-passes.def (avr_pass_2moves): Insert after combine.
* config/avr/avr-passes.cc (make_avr_pass_2moves): New function.
(pass_data avr_pass_data_2moves): New static variable.
(avr_pass_2moves): New rtl_opt_pass.
* config/avr/avr-protos.h (make_avr_pass_2moves): New proto.
* common/config/avr/avr-common.cc
(default_options avr_option_optimization_table) <-mfuse-move2>:
Set for -O1 and higher.
* doc/invoke.texi (AVR Options) <-mfuse-move2>: Document.
|
|
Converting from generic AS to __flashx used the same rule like
for __memx, which tags RAM (generic AS) locations by setting bit 23.
The justification was that generic isn't a subset of __flashx, though
that lead to surprises with code like const __flashx *x = NULL.
The natural thing to do is to just load 0x000000 in that case,
so that the null pointer works in __flashx as expected.
Apart from that, converting NULL to __flashx (or __flash) no more
raises a -Waddr-space-convert diagnostic.
gcc/
PR target/121277
* config/avr/avr.cc (avr_addr_space_convert): When converting
from generic AS to __flashx, don't set bit 23.
(avr_convert_to_type): Don't -Waddr-space-convert when NULL
is converted to __flashx or to __flash.
|
|
gcc/
* config/avr/avr-passes.cc (avr_optimize_casesi): Fuse
get_insns() with end_sequence().
|
|
gcc/
* config/avr/avr-mcus.def: -mmcu= takes lower case MCU names.
* doc/avr-mmcu.texi: Rebuild.
|
|
gcc/
* config/avr/avr-mcus.def (avr32da28S, avr32da32S, avr32da48S)
(avr64da28S, avr64da32S, avr64da48S avr64da64S)
(avr128da28S, avr128da32S, avr128da48S, avr128da64S): Add devices.
* doc/avr-mmcu.texi: Rebuild.
|
|
This fixes an ICE with -mno-lra when split2 tries to split the following
zero_extendsidi2 insn:
(set (reg:DI 24)
(zero_extend:DI (reg:SI **)))
The ICE is because avr_hard_regno_mode_ok allows R24:DI but disallows
R28:SI when Reload is used. R28:SI is a result of zero_extendsidi2.
This ICE only occurs with Reload (which will die before very long),
but it occurs when building libgcc.
gcc/
PR target/120856
* config/avr/avr.cc (avr_hard_regno_mode_ok) [-mno-lra]: Deny
hard regs >= 4 bytes that overlap Y.
|
|
Now that the patches for PR120424 are upstream, the last known bug
associated with avr+lra has been fixed: PR118591. So we can pull the
switch that turns on LRA per default.
This patch only sets -mlra per default. It doesn't do any Reload related
cleanup or removal from the avr backend, hence -mno-lra still works.
The only new problem is that gcc.dg/torture/pr64088.c fails with LRA
but not with Reload. Though that test case is awkward since it is UB
but expects the compiler to behave in a specific way which avr-gcc
doesn't do: PR116780.
This patch also avoids a relative recent ICE that breaks building libgcc:
R24:DI is allowed per hard_regno_mode_ok, but R26:SI is disallowed
for Reload for old reasons. Outcome is that a split2 pattern for
R24:DI = zero_extend:DI (R22:SI) runs into an ICE.
AVR-LibC builds fine with this patch.
The AVR-LibC testsuite passes without errors.
gcc/
PR target/113934
* config/avr/avr.opt (-mlra): Turn on per default.
|
|
The problem with PR120423 and PR116389 is that reload might assign an invalid
hard register to a paradoxical subreg. For example with the test case from
the PR, it assigns (REG:QI 31) to the inner of (subreg:HI (QI) 0) which is
valid, but the subreg will be turned into (REG:HI 31) which is invalid
and triggers an ICE in postreload.
The problem only occurs with the old reload pass.
The patch maps the paradoxical subregs to a zero-extends which will be
allocated correctly. For the 120423 testcases, the code is the same like
with -mlra (which doesn't implement the fix), so the patch doesn't even
introduce a performance penalty.
The patch is only needed for v15: v14 is not affected, and in v16 reload
will be removed.
PR rtl-optimization/120423
PR rtl-optimization/116389
gcc/
* config/avr/avr.md [-mno-lra]: Add pre-reload split to transform
(left shift of) a paradoxical subreg to a (left shift of) zero-extend.
gcc/testsuite/
* gcc.target/avr/torture/pr120423-1.c: New test.
* gcc.target/avr/torture/pr120423-2.c: New test.
* gcc.target/avr/torture/pr120423-116389.c: New test.
(cherry picked from commit 61789b5abec3079d02ee9eaa7468015ab1f6f701)
|
|
This is the result of using a regexp to replace:
rtx( |_insn *)<stuff> = end_sequence ();
return <stuff>;
with:
return end_sequence ();
gcc/
* asan.cc (asan_emit_allocas_unpoison): Directly return the
result of end_sequence.
(hwasan_emit_untag_frame): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
* config/arm/arm.cc (arm_call_tls_get_addr): Likewise.
* config/avr/avr-passes.cc (avr_parallel_insn_from_insns): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.
|
|
This is the result of using a regexp to replace instances of:
<stuff> = get_insns ();
end_sequence ();
with:
<stuff> = end_sequence ();
where the indentation is the same for both lines, and where there
might be blank lines inbetween.
gcc/
* asan.cc (asan_clear_shadow): Use the return value of end_sequence,
rather than calling get_insns separately.
(asan_emit_stack_protection, asan_emit_allocas_unpoison): Likewise.
(hwasan_frame_base, hwasan_emit_untag_frame): Likewise.
* auto-inc-dec.cc (attempt_change): Likewise.
* avoid-store-forwarding.cc (process_store_forwarding): Likewise.
* bb-reorder.cc (fix_crossing_unconditional_branches): Likewise.
* builtins.cc (expand_builtin_apply_args): Likewise.
(expand_builtin_return, expand_builtin_mathfn_ternary): Likewise.
(expand_builtin_mathfn_3, expand_builtin_int_roundingfn): Likewise.
(expand_builtin_int_roundingfn_2, expand_builtin_saveregs): Likewise.
(inline_string_cmp): Likewise.
* calls.cc (expand_call): Likewise.
* cfgexpand.cc (expand_asm_stmt, pass_expand::execute): Likewise.
* cfgloopanal.cc (init_set_costs): Likewise.
* cfgrtl.cc (insert_insn_on_edge, prepend_insn_to_edge): Likewise.
(rtl_lv_add_condition_to_bb): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
(aarch64_do_track_speculation): Likewise.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately)
(aarch64_expand_vector_init, aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next, aarch64_mode_emit): Likewise.
(aarch64_md_asm_adjust): Likewise.
(aarch64_switch_pstate_sm_for_landing_pad): Likewise.
(aarch64_switch_pstate_sm_for_jump): Likewise.
(aarch64_switch_pstate_sm_for_call): Likewise.
* config/alpha/alpha.cc (alpha_legitimize_address_1): Likewise.
(alpha_emit_xfloating_libcall, alpha_gp_save_rtx): Likewise.
* config/arc/arc.cc (hwloop_optimize): Likewise.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm-builtins.cc: Likewise.
* config/arm/arm.cc (require_pic_register): Likewise.
(arm_call_tls_get_addr, arm_gen_load_multiple_1): Likewise.
(arm_gen_store_multiple_1, cmse_clear_registers): Likewise.
(cmse_nonsecure_call_inline_register_clear): Likewise.
(arm_attempt_dlstp_transform): Likewise.
* config/avr/avr-passes.cc (bbinfo_t::optimize_one_block): Likewise.
(avr_parallel_insn_from_insns): Likewise.
* config/avr/avr.cc (avr_prologue_setup_frame): Likewise.
(avr_expand_epilogue): Likewise.
* config/bfin/bfin.cc (hwloop_optimize): Likewise.
* config/c6x/c6x.cc (c6x_expand_compare): Likewise.
* config/cris/cris.cc (cris_split_movdx): Likewise.
* config/cris/cris.md: Likewise.
* config/csky/csky.cc (csky_call_tls_get_addr): Likewise.
* config/epiphany/resolve-sw-modes.cc
(pass_resolve_sw_modes::execute): Likewise.
* config/fr30/fr30.cc (fr30_move_double): Likewise.
* config/frv/frv.cc (frv_split_scc, frv_split_cond_move): Likewise.
(frv_split_minmax, frv_split_abs): Likewise.
* config/frv/frv.md: Likewise.
* config/gcn/gcn.cc (move_callee_saved_registers): Likewise.
(gcn_expand_prologue, gcn_restore_exec, gcn_md_reorg): Likewise.
* config/i386/i386-expand.cc
(ix86_expand_carry_flag_compare, ix86_expand_int_movcc): Likewise.
(ix86_vector_duplicate_value, expand_vec_perm_interleave2): Likewise.
(expand_vec_perm_vperm2f128_vblend): Likewise.
(expand_vec_perm_2perm_interleave): Likewise.
(expand_vec_perm_2perm_pblendv): Likewise.
(expand_vec_perm2_vperm2f128_vblend, ix86_gen_ccmp_first): Likewise.
(ix86_gen_ccmp_next): Likewise.
* config/i386/i386-features.cc
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_reg, scalar_chain::convert_op): Likewise.
(timode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.cc (ix86_init_pic_reg, ix86_va_start): Likewise.
(ix86_get_drap_rtx, legitimize_tls_address): Likewise.
(ix86_md_asm_adjust): Likewise.
* config/ia64/ia64.cc (ia64_expand_tls_address): Likewise.
(ia64_expand_compare, spill_restore_mem): Likewise.
(expand_vec_perm_interleave_2): Likewise.
* config/loongarch/loongarch.cc
(loongarch_call_tls_get_addr): Likewise.
* config/m32r/m32r.cc (gen_split_move_double): Likewise.
* config/m32r/m32r.md: Likewise.
* config/m68k/m68k.cc (m68k_call_tls_get_addr): Likewise.
(m68k_call_m68k_read_tp, m68k_sched_md_init_global): Likewise.
* config/m68k/m68k.md: Likewise.
* config/microblaze/microblaze.cc
(microblaze_call_tls_get_addr): Likewise.
* config/mips/mips.cc (mips_call_tls_get_addr): Likewise.
(mips_ls2_init_dfa_post_cycle_insn): Likewise.
(mips16_split_long_branches): Likewise.
* config/nvptx/nvptx.cc (nvptx_gen_shuffle): Likewise.
(nvptx_gen_shared_bcast, nvptx_propagate): Likewise.
(workaround_uninit_method_1, workaround_uninit_method_2): Likewise.
(workaround_uninit_method_3): Likewise.
* config/or1k/or1k.cc (or1k_init_pic_reg): Likewise.
* config/pa/pa.cc (legitimize_tls_address): Likewise.
* config/pru/pru.cc (pru_expand_fp_compare, pru_reorg_loop): Likewise.
* config/riscv/riscv-shorten-memrefs.cc
(pass_shorten_memrefs::transform): Likewise.
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Likewise.
* config/riscv/riscv.cc (riscv_call_tls_get_addr): Likewise.
(riscv_frm_emit_after_bb_end): Likewise.
* config/rl78/rl78.cc (rl78_emit_libcall): Likewise.
* config/rs6000/rs6000.cc (rs6000_debug_legitimize_address): Likewise.
* config/s390/s390.cc (legitimize_tls_address): Likewise.
(s390_two_part_insv, s390_load_got, s390_va_start): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* config/sparc/sparc.cc (sparc_legitimize_tls_address): Likewise.
(sparc_output_mi_thunk, sparc_init_pic_reg): Likewise.
* config/stormy16/stormy16.cc (xstormy16_split_cbranch): Likewise.
* config/xtensa/xtensa.cc (xtensa_copy_incoming_a7): Likewise.
(xtensa_expand_block_set_libcall): Likewise.
(xtensa_expand_block_set_unrolled_loop): Likewise.
(xtensa_expand_block_set_small_loop, xtensa_call_tls_desc): Likewise.
* dse.cc (emit_inc_dec_insn_before, find_shift_sequence): Likewise.
(replace_read): Likewise.
* emit-rtl.cc (reorder_insns, gen_clobber, gen_use): Likewise.
* except.cc (dw2_build_landing_pads, sjlj_mark_call_sites): Likewise.
(sjlj_emit_function_enter, sjlj_emit_function_exit): Likewise.
(sjlj_emit_dispatch_table): Likewise.
* expmed.cc (expmed_mult_highpart_optab, expand_sdiv_pow2): Likewise.
* expr.cc (convert_mode_scalar, emit_move_multi_word): Likewise.
(gen_move_insn, expand_cond_expr_using_cmove): Likewise.
(expand_expr_divmod, expand_expr_real_2): Likewise.
(maybe_optimize_pow2p_mod_cmp, maybe_optimize_mod_cmp): Likewise.
* function.cc (emit_initial_value_sets): Likewise.
(instantiate_virtual_regs_in_insn, expand_function_end): Likewise.
(get_arg_pointer_save_area, make_split_prologue_seq): Likewise.
(make_prologue_seq, gen_call_used_regs_seq): Likewise.
(thread_prologue_and_epilogue_insns): Likewise.
(match_asm_constraints_1): Likewise.
* gcse.cc (prepare_copy_insn): Likewise.
* ifcvt.cc (noce_emit_store_flag, noce_emit_move_insn): Likewise.
(noce_emit_cmove): Likewise.
* init-regs.cc (initialize_uninitialized_regs): Likewise.
* internal-fn.cc (expand_POPCOUNT): Likewise.
* ira-emit.cc (emit_move_list): Likewise.
* ira.cc (ira): Likewise.
* loop-doloop.cc (doloop_modify): Likewise.
* loop-unroll.cc (compare_and_jump_seq): Likewise.
(unroll_loop_runtime_iterations, insert_base_initialization): Likewise.
(split_iv, insert_var_expansion_initialization): Likewise.
(combine_var_copies_in_loop_exit): Likewise.
* lower-subreg.cc (resolve_simple_move,resolve_shift_zext): Likewise.
* lra-constraints.cc (match_reload, check_and_process_move): Likewise.
(process_addr_reg, insert_move_for_subreg): Likewise.
(process_address_1, curr_insn_transform): Likewise.
(inherit_reload_reg, process_invariant_for_inheritance): Likewise.
(inherit_in_ebb, remove_inheritance_pseudos): Likewise.
* lra-remat.cc (do_remat): Likewise.
* mode-switching.cc (commit_mode_sets): Likewise.
(optimize_mode_switching): Likewise.
* optabs.cc (expand_binop, expand_twoval_binop_libfunc): Likewise.
(expand_clrsb_using_clz, expand_doubleword_clz_ctz_ffs): Likewise.
(expand_doubleword_popcount, expand_ctz, expand_ffs): Likewise.
(expand_absneg_bit, expand_unop, expand_copysign_bit): Likewise.
(prepare_float_lib_cmp, expand_float, expand_fix): Likewise.
(expand_fixed_convert, gen_cond_trap): Likewise.
(expand_atomic_fetch_op): Likewise.
* ree.cc (combine_reaching_defs): Likewise.
* reg-stack.cc (compensate_edge): Likewise.
* reload1.cc (emit_input_reload_insns): Likewise.
* sel-sched-ir.cc (setup_nop_and_exit_insns): Likewise.
* shrink-wrap.cc (emit_common_heads_for_components): Likewise.
(emit_common_tails_for_components): Likewise.
(insert_prologue_epilogue_for_components): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.
(insert_value_copy_on_edge): Likewise.
* tree-ssa-loop-ivopts.cc (computation_cost): Likewise.
|
|
libgcc's __xload_1...4 is clobbering Z (and also R21 is some cases),
but avr.md had clobbers of respective GPRs only up to reload.
Outcome was that code reading from the same __memx address twice
could be wrong. This patch adds respective clobbers.
Forward-port from 2025-04-30 r14-11703
PR target/119989
gcc/
* config/avr/avr.md (xload_<mode>_libgcc): Clobber R21, Z.
gcc/testsuite/
* gcc.target/avr/torture/pr119989.h: New file.
* gcc.target/avr/torture/pr119989-memx-1.c: New test.
* gcc.target/avr/torture/pr119989-memx-2.c: New test.
* gcc.target/avr/torture/pr119989-memx-3.c: New test.
* gcc.target/avr/torture/pr119989-memx-4.c: New test.
* gcc.target/avr/torture/pr119989-flashx-1.c: New test.
* gcc.target/avr/torture/pr119989-flashx-2.c: New test.
* gcc.target/avr/torture/pr119989-flashx-3.c: New test.
* gcc.target/avr/torture/pr119989-flashx-4.c: New test.
(cherry picked from commit 1ca1c1fc3b58ae5e1d3db4f5a2014132fe69f82a)
|
|
gcc/
* config/avr/avr-mcus.def: Add AVR32SD20, AVR32SD28, AVR32SD32,
AVR64SD28, AVR64SD32, AVR64SD48.
* doc/avr-mmcu.texi: Rebuild.
|
|
This patch uses a name for the dump file that makes it clear where
in the pass chain the 2nd run of peephole2 is located.
gcc/
* config/avr/avr.cc (avr_option_override): Use
"avr-peep2-after-fuse-move" as dump name instead of "peephole2".
|
|
gcc/
* config/avr/avr.opt.urls: Add -muse-nonzero-bits.
|
|
There are occasions where knowledge about nonzero bits makes some
optimizations possible. For example,
Rd |= Rn << Off
can be implemented as
SBRC Rn, 0
ORI Rd, 1 << Off
when Rn in { 0, 1 }, i.e. nonzero_bits (Rn) == 1. This patch adds some
patterns that exploit nonzero_bits() in some combiner patterns.
As insn conditions are not supposed to contain nonzero_bits(), the patch
splits such insns right after pass insn combine.
PR target/119421
gcc/
* config/avr/avr.opt (-muse-nonzero-bits): New option.
* config/avr/avr-protos.h (avr_nonzero_bits_lsr_operands_p): New.
(make_avr_pass_split_nzb): New.
* config/avr/avr.cc (avr_nonzero_bits_lsr_operands_p): New function.
(avr_rtx_costs_1): Return costs for the new insns.
* config/avr/avr.md (nzb): New insn attribute.
(*nzb=1.<code>...): New insns to better support some bit
operations for <code> in AND, IOR, XOR.
* config/avr/avr-passes.def (avr_pass_split_nzb): Insert pass
atfer combine.
* config/avr/avr-passes.cc (avr_pass_data_split_nzb). New pass data.
(avr_pass_split_nzb): New pass.
(make_avr_pass_split_nzb): New function.
* common/config/avr/avr-common.cc (avr_option_optimization_table):
Enable -muse-nonzero-bits for -O2 and higher.
* doc/invoke.texi (AVR Options): Document -muse-nonzero-bits.
gcc/testsuite/
* gcc.target/avr/torture/pr119421-sreg.c: New test.
|
|
Code in .initN and .initN sections is never called since these
sections are special and part of the startup resp. shutdown code.
This patch adds attribute "used" so they won't be optimized out.
gcc/
* config/avr/avr.cc (avr_attrs_section_name): New function.
(avr_insert_attributes): Add "used" attribute to functions
in .initN and .finiN.
|
|
This ICE only occurred when the compiler is built with, say
CXXFLAGS='-Wp,-D_GLIBCXX_ASSERTIONS'. The problem was that
a value from an illegal REGNO was read. The value was not
used in these cases, but the access triggered an assertion
due to reading past std::array.
gcc/
PR target/119355
* config/avr/avr-passes.cc (memento_t::apply): Only
read values[p.arg] when it is actually used.
|
|
As can be seen in gcc/po/gcc.pot:
#: config/avr/avr.cc:2754
#, c-format
msgid "bad I/O address 0x"
msgstr ""
exgettext couldn't retrieve the whole format string in this case,
because it uses a macro in the middle. output_operand_lossage
is c-format function though, so we can't use %wx to print HOST_WIDE_INT,
and HOST_WIDE_INT_PRINT_HEX_PURE is on some hosts %lx, on others %llx
and on others %I64x so isn't really translatable that way.
As Joseph mentioned in the PR, there is no easy way around this
but go through a temporary buffer, which the following patch does.
2025-03-02 Jakub Jelinek <jakub@redhat.com>
PR translation/118991
* config/avr/avr.cc (avr_print_operand): Print ival into
a temporary buffer and use %s in output_operand_lossage to make
the diagnostics translatable.
|
|
gcc/
PR target/118764
* config/avr/gen-avr-mmcu-specs.cc (print_mcu)
[has CVT]: Mention CVT in header comment of generated specs file.
|
|
gcc/
PR target/118764
* config/avr/avr-c.cc (avr_cpu_cpp_builtins)
[TARGET_CVT]: Define __AVR_CVT__.
* doc/invoke.texi (AVR Built-in Macros): Document __AVR_CVT__.
|
|
When REG_UNUSED notes indicate that some result bytes are not
used by the following code, then there's no need to asm out them.
The patch uses such notes for the asm out of AND, IOR, XOR, PLUS, MINUS.
gcc/
* config/avr/avr.cc (avr_result_regno_unused_p): New static function.
(avr_out_bitop): Only output result bytes that are used.
(avr_out_plus_1): Same.
|
|
This patch executes avr_builtin_supported_p at a later time and in
avr_resolve_overloaded_builtin. This allows for better diagnostics
and avoids lto1 hiccups when a built-in decl is NULL_TREE.
gcc/
* config/avr/avr-protos.h (avr_builtin_supported_p): Remove.
* config/avr/avr.cc (avr_init_builtins): Don't initialize
non-available built-ins with NULL_TREE.
(avr_builtin_supported_p): Move to...
* config/avr/avr-c.cc: ...here.
(avr_resolve_overloaded_builtin): Run avr_builtin_supported_p.
|
|
After register allocation, paradoxical subregs may become something
like r20:SI += r22:SI which doesn't make much sense as assembly code.
Hence avr_out_plus_1() used to ICE on such code. However, paradoxical
subregs appear to be a common optimization device (instead of proper
mode demotion).
PR target/118878
gcc/
* config/avr/avr.cc (avr_out_plus_1): Don't ICE on result of
paradoxical reg's register allocation.
gcc/testsuite/
* gcc.target/avr/torture/pr118878.c: New test.
|
|
gcc/
* config/avr/avr.opt.urls: Add -mcall-main.
|
|
On devices with very limited resources, it may be desirable to run
main in a more efficient way than provided by the startup code
XCALL main
XJMP exit
from section .init9. In AVR-LibC v2.3, that code has been moved to
libmcu.a, hence symbol __call_main can be satisfied so that the
respective code is no more pulled in from that library.
Instead, main can be run by putting it in section .init9.
The patch adds attributes noreturn and section(".init9"), and
sets __call_main=0 when it encounters main().
gcc/
PR target/118806
* config/avr/avr.opt (-mcall-main): New option and...
(avropt_call_main): ...variable.
* config/avr/avr.cc (avr_no_call_main_p): New variable.
(avr_insert_attributes) [-mno-call-main, main]: Add attributes
noreturn and section(".init9") to main. Set avr_no_call_main_p.
(avr_file_end) [avr_no_call_main_p]: Define symbol __call_main.
* doc/invoke.texi (AVR Options) <-mno-call-main>: Document.
<-mnodevicelib>: Extend explanation.
|
|
gcc/
* config/avr/avr.opt.urls: Add mcvt.
|
|
Some AVR devices support a CVT:
- Devices from the 0-series, 1-series, 2-series.
- AVR16, AVR32, AVR64, AVR128 devices.
The support is provided by means of a startup code file
crt<mcu>-cvt.o from AVR-LibC v2.3 that can be linked instead
of the traditional crt<mcu>.o.
This patch adds a new command line option -mcvt that links
that CVT startup code (or issues an error when the device
doesn't support a CVT).
PR target/118764
gcc/
* config/avr/avr.opt (-mcvt): New target option.
* config/avr/avr-arch.h (AVR_CVT): New enum value.
* config/avr/avr-mcus.def: Add AVR_CVT flag for devices that
support it.
* config/avr/avr.cc (avr_handle_isr_attribute) [TARGET_CVT]: Issue
an error when a vector number larger that 3 is used.
* config/avr/gen-avr-mmcu-specs.cc (McuInfo.have_cvt): New property.
(print_mcu) <*avrlibc_startfile>: Use crt<mcu>-cvt.o depending
on -mcvt (or issue an error when the device doesn't support a CVT).
* doc/invoke.texi (AVR Options): Document -mcvt.
|
|
gcc/
PR target/118768
* config/avr/genmultilib.awk: Parse the AVR_MCU lines in
a more robust way w.r.t. white spaces.
|
|
This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.
gcc/
* config/avr/builtins.def (STRLEN_FLASH, STRLEN_FLASHX)
(STRLEN_MEMX): New DEF_BUILTIN's.
* config/avr/avr.cc (avr_ftype_strlen): New static function.
(avr_builtin_supported_p): New built-ins are not for AVR_TINY.
(avr_init_builtins) <strlen_flash_node, strlen_flashx_node,
strlen_memx_node>: Provide new fntypes.
(avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH]
[AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if
possible.
* doc/extend.texi (AVR Built-in Functions): Document
__builtin_avr_strlen_flash, __builtin_avr_strlen_flashx,
__builtin_avr_strlen_memx.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx.
* config/avr/lib1funcs.S <L_strlen_memx, __strlen_memx>: Implement.
|
|
Some built-ins are not available for C++ since they are using
named address-spaces or fixed-point types.
gcc/
* config/avr/builtins.def (AVR_FIRST_C_ONLY_BUILTIN_ID): New macro.
* config/avr/avr-protos.h (avr_builtin_supported_p): New.
* config/avr/avr.cc (avr_builtin_supported_p): New function.
(avr_init_builtins): Only provide a built-in when it is supported.
* config/avr/avr-c.cc (avr_cpu_cpp_builtins): Only define the
__BUILTIN_AVR_<NAME> build-in defines when the associated built-in
function is supported.
* doc/extend.texi (AVR Built-in Functions): Add a note that
following built-ins are supported for only for GNU-C.
|
|
libgcc has a module for __negsi2: REG_22:SI := - REG_22:SI.
This patch adds a pattern that allows to share that function
provided optimize_size.
gcc/
* config/avr/avr.md (*negsi2.libgcc): New insn.
|
|
This patch tries to work around PR118012 which may use a
full fledged multiplication instead of a simple bit test.
This is because match.pd's
/* (zero_one == 0) ? y : z <op> y -> ((typeof(y))zero_one * z) <op> y */
/* (zero_one != 0) ? z <op> y : y -> ((typeof(y))zero_one * z) <op> y */
"optimizes" code with op in { plus, ior, xor } like
if (a & 1)
b = b <op> c;
to something like:
x1 = EXTRACT_BIT0 (a);
x2 = c MULT x1;
b = b <op> x2;
or
x1 = EXTRACT_BIT0 (a);
x2 = ZERO_EXTEND (x1);
x3 = NEG x2;
x4 = a AND x3:
b = b <op> x4;
which is very expensive and may even result in a libgcc call for
a 32-bit multiplication on devices that don't even have MUL.
Notice that EXTRACT_BIT0 is already more expensive (slower, more
code, more register pressure) than a bit-test + branch.
The patch:
o Adds some combiner patterns that try to map sick code back
to a bit test + branch.
o Adjusts costs to make MULT (x AND 1) cheap, in the hope that the
middle-end will use that alternative (which we map to sane code).
o On devices without MUL, 32-bit multiplication was performed by a
library call, which bypasses the MULT (x AND 1) and similar patterns.
Therefore, mulsi3 is also allowed for devices without MUL so that
we get at MULT pattern that can be transformed. (Though this is
not possible on AVR_TINY since it passes arguments on the stack).
o Add a new command line option -mpr118012, so most of the patterns
and cost computations can be switched off as they have
avropt_pr118012 in their insn condition.
o Added sign-extract.0 patterns unconditionally (no avropt_pr118012).
Notice that this patch is just a work-around, it's not a fix of the
root cause, which are the patterns in match.pd that don't care about
the target and don't even care about costs.
The work-around is incomplete, and 3 of the new tests are still failing.
This is because there are situations where it does not work:
* The MULT is realized as a library call.
* The MULT is realized as an ASHIFT, and the ASHIFT again is transformed
into something else. For example, with -O2 -mmcu=atmega128,
ASHIFT(3) is transformed into ASHIFT(1) + ASHIFT(2).
PR tree-optimization/118012
PR tree-optimization/118360
gcc/
* config/avr/avr.opt (-mpr118012): New undocumented option.
* config/avr/avr-protos.h (avr_out_sextr)
(avr_emit_skip_pixop, avr_emit_skip_clear): New protos.
* config/avr/avr.cc (avr_adjust_insn_length)
[case ADJUST_LEN_SEXTR]: Handle case.
(avr_rtx_costs_1) [NEG]: Costs for NEG (ZERO_EXTEND (ZERO_EXTRACT)).
[MULT && avropt_pr118012]: Costs for MULT (x AND 1).
(avr_out_sextr, avr_emit_skip_pixop, avr_emit_skip_clear): New
functions.
* config/avr/avr.md [avropt_pr118012]: Add combine patterns with
that condition that try to work around PR118012.
(adjust_len) <sextr>: Add insn attr value.
(pixop): New code iterator.
(mulsi3) [avropt_pr118012 && !AVR_TINY]: Allow these in insn condition.
gcc/testsuite/
* gcc.target/avr/mmcu/pr118012-1.h: New file.
* gcc.target/avr/mmcu/pr118012-1-o2-m128.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-m128.c: New test.
* gcc.target/avr/mmcu/pr118012-1-o2-m103.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-m103.c: New test.
* gcc.target/avr/mmcu/pr118012-1-o2-t40.c: New test.
* gcc.target/avr/mmcu/pr118012-1-os-t40.c: New test.
* gcc.target/avr/mmcu/pr118360-1.h: New file.
* gcc.target/avr/mmcu/pr118360-1-o2-m128.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-m128.c: New test.
* gcc.target/avr/mmcu/pr118360-1-o2-m103.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-m103.c: New test.
* gcc.target/avr/mmcu/pr118360-1-o2-t40.c: New test.
* gcc.target/avr/mmcu/pr118360-1-os-t40.c: New test.
|
|
As it turns out, logical 32-bit shifts with an offset of 25..30 can
be performed in 7 instructions or less. This beats the 7 instruc-
tions required for the default code of a shift loop.
Plus, with zero overhead, these cases can be 3-operand.
This is only relevant for -Oz because with -Os, 3op shifts are
split with -msplit-bit-shift (which is not performed with -Oz).
PR target/117726
gcc/
* config/avr/avr.cc (avr_ld_regno_p): New function.
(ashlsi3_out) [case 25,26,27,28,29,30]: Handle and tweak.
(lshrsi3_out): Same.
(avr_rtx_costs_1) [SImode, ASHIFT, LSHIFTRT]: Adjust costs.
* config/avr/avr.md (ashlsi3, *ashlsi3, *ashlsi3_const):
Add "r,r,C4L" alternative.
(lshrsi3, *lshrsi3, *lshrsi3_const): Add "r,r,C4R" alternative.
* config/avr/constraints.md (C4R, C4L): New,
gcc/testsuite/
* gcc.target/avr/torture/avr-torture.exp (AVR_TORTURE_OPTIONS):
Turn one option variant into -Oz.
|
|
u16 << 5 and u16 << 6 can be tweaked by using MUL instructions.
Benefit is a better speed ratio with -Os and smaller size with -O2.
gcc/
* config/avr/avr-passes.cc (avr_emit_shift) [ASHIFT,HImode]:
Allow offsets 5 and 6 as 3op provided have MUL and a scratch.
* config/avr/avr.cc (avr_optimize_size_max_p): New function.
(avr_out_ashlhi3_mul): New function.
(ashlhi3_out) [case 4, 5, 6]: Better speed for -Os.
* config/avr/avr.md (isa) <mul, no_mul>: New attr values.
(*ashlhi3_const): Add alternative for offsets 5 and 6.
|
|
gcc/
* config/avr/avr-c.cc (DEF_BUILTIN): Add ATTRS argument to macro
definition.
* config/avr/avr.cc: Same.
(avr_init_builtins) <attr_const>: New variable that can be used
as ATTRS argument in DEF_BUILTIN.
* config/avr/builtins.def (DEF_BUILTIN): Add ATTRS parameter
to all definitions.
|
|
This patch uses the INT_N interface to define __int24 in avr-modes.def.
Since the testsuite uses -Wpedantic and __int24 is a C/C++ extension,
uses of __int24 and __uint24 is now marked as __extension__.
PR target/118329
gcc/
* config/avr/avr-modes.def: Add INT_N (PSI, 24).
* config/avr/avr.cc (avr_init_builtin_int24)
<__int24>: Remove definition.
<__uint24>: Adjust definition to INT_N interface.
gcc/testsuite/
* gcc.target/avr/pr115830-add.c (__int24, __uint24): Add __extension__
to respective typedefs.
* gcc.target/avr/pr115830-sub-ext.c: Same.
* gcc.target/avr/pr115830-sub.c: Same.
* gcc.target/avr/torture/get-mem.c: Same.
* gcc.target/avr/torture/set-mem.c: Same.
* gcc.target/avr/torture/ifelse-c.h: Same.
* gcc.target/avr/torture/ifelse-d.h: Same.
* gcc.target/avr/torture/ifelse-q.h: Same.
* gcc.target/avr/torture/ifelse-r.h: Same.
* gcc.target/avr/torture/int24-mul.c: Same.
* gcc.target/avr/torture/pr109907-2.c: Same.
* gcc.target/avr/torture/pr61443.c: Same.
* gcc.target/avr/torture/pr63633-ice-mult.c: Same.
* gcc.target/avr/torture/shift-l-u24.c: Same.
* gcc.target/avr/torture/shift-r-i24.c: Same.
* gcc.target/avr/torture/shift-r-u24.c: Same.
* gcc.target/avr/torture/add-extend.c: Same.
* gcc.target/avr/torture/sub-extend.c: Same.
* gcc.target/avr/torture/sub-zerox.c: Same.
* gcc.target/avr/torture/test-gprs.h: Same.
|
|
|
|
* rampz_rtx et al. were missing MEM_VOLATILE_P. This is needed because
avr_emit_cpymemhi is setting RAMPZ explicitly with an own insn.
* avr_out_cpymem was missing a final RAMPZ = 0 on EBI devices.
This only affects the __flash1 ... __flash5 spaces since the other ASes
use different routines,
gcc/
PR target/118000
* config/avr/avr.cc (avr_init_expanders) <sreg_rtx>
<rampd_rtx, rampx_rtx, rampy_rtx, rampz_rtx>: Set MEM_VOLATILE_P.
(avr_out_cpymem) [ELPM && EBI]: Restore RAMPZ to 0 after.
|
|
gcc/
* config/avr/avr.cc (avr_ctz): New constexpr function.
(section_common::flags): Assert minimal bit width.
|
|
This patch adds __flashx as a new named address space that allocates
objects in .progmemx.data. The handling is mostly the same or similar
to that of 24-bit space __memx, except that the asm routines are
simpler and more efficient. Loads are emit inline when ELPMX or
LPMX is available. The address space uses a 24-bit addresses even
on devices with a program memory size of 64 KiB or less.
PR target/118001
gcc/
* doc/extend.texi (AVR Named Address Spaces): Document __flashx.
* config/avr/avr.h (ADDR_SPACE_FLASHX): New enum value.
* config/avr/avr-protos.h (avr_out_fload, avr_mem_flashx_p)
(avr_fload_libgcc_p, avr_load_libgcc_mem_p)
(avr_load_libgcc_insn_p): New.
* config/avr/avr.cc (avr_addrspace): Add ADDR_SPACE_FLASHX.
(avr_decl_flashx_p, avr_mem_flashx_p, avr_fload_libgcc_p)
(avr_load_libgcc_mem_p, avr_load_libgcc_insn_p, avr_out_fload):
New functions.
(avr_adjust_insn_length) [ADJUST_LEN_FLOAD]: Handle case.
(avr_progmem_p) [avr_decl_flashx_p]: return 2.
(avr_addr_space_legitimate_address_p) [ADDR_SPACE_FLASHX]:
Has same behavior like ADDR_SPACE_MEMX.
(avr_addr_space_convert): Use pointer sizes rather then ASes.
(avr_addr_space_contains): New function.
(avr_convert_to_type): Use it.
(avr_emit_cpymemhi): Handle ADDR_SPACE_FLASHX.
* config/avr/avr.md (adjust_len) <fload>: New attr value.
(gen_load<mode>_libgcc): Renamed from load<mode>_libgcc.
(xload8<mode>_A): Iterate over MOVMODE rather than over ALL1.
(fxmov<mode>_A): New from xloadv<mode>_A.
(xmov<mode>_8): New from xload<mode>_A.
(fmov<mode>): New insns.
(fxload<mode>_A): New from xload<mode>_A.
(fxload_<mode>_libgcc): New from xload_<mode>_libgcc.
(*fxload_<mode>_libgcc): New from *xload_<mode>_libgcc.
(mov<mode>) [avr_mem_flashx_p]: Hande ADDR_SPACE_FLASHX.
(cpymemx_<mode>): Make sure the address space is not lost
when splitting.
(*cpymemx_<mode>) [ADDR_SPACE_FLASHX]: Use __movmemf_<mode> for asm.
(*ashlqi.1.zextpsi_split): New combine pattern.
* config/avr/predicates.md (nox_general_operand): Don't match
when avr_mem_flashx_p is true.
* config/avr/avr-passes.cc (AVR_LdSt_Props):
ADDR_SPACE_FLASHX has no post_inc.
gcc/testsuite/
* gcc.target/avr/torture/addr-space-1.h [AVR_HAVE_ELPM]:
Use a function to bump .progmemx.data to a high address.
* gcc.target/avr/torture/addr-space-2.h: Same.
* gcc.target/avr/torture/addr-space-1-fx.c: New test.
* gcc.target/avr/torture/addr-space-2-fx.c: New test.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _fload_1, _fload_2,
_fload_3, _fload_4, _movmemf.
* config/avr/lib1funcs.S (.branch_plus): New .macro.
(__xload_1, __xload_2, __xload_3, __xload_4): When the address is
located in flash, then forward to...
(__fload_1, __fload_2, __fload_3, __fload_4): ...these new
functions, respectively.
(__movmemx_hi): When the address is located in flash, forward to...
(__movmemf_hi): ...this new function.
|