riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
5 days	AVR: Support AVR32EB14/20/28/32.	Georg-Johann Lay	1	-0/+4
	Add support for some recent AVR devices. gcc/ * config/avr/avr-mcus.def: Add avr32eb14, avr32eb20, avr32eb28, avr32eb32. * doc/avr-mmcu.texi: Rebuild.
13 days	AVR: ad target/121794 - Invoke zero_reg less.	Georg-Johann Lay	1	-5/+5
	gcc/ PR target/121794 * config/avr/avr.md (cmpqi3): Use cpi R,0 if possible.
2025-09-05	AVR: target/121794 - Invoke zero_reg less.	Georg-Johann Lay	1	-27/+47
	There are some cases where invoking zero_reg is not needed and where there are other sequences with the same efficiency. An example is to use SBCI R,0 instead of SBC R,__zero_reg__ when R >= R16. This may turn out to be better for small ISRs. PR target/121794 gcc/ * config/avr/avr.cc (avr_out_compare): Only use zero_reg when there is no other sequence of the same length. (avr_out_plus_ext): Same. (avr_out_plus_1): Same.
2025-08-20	AVR: target/121608 - Don't add --relax when linking with -r.	Georg-Johann Lay	1	-1/+1
	The linker rejects --relax in relocatable links (-r), hence only add --relax when -r is not specified. gcc/ PR target/121608 * config/avr/specs.h (LINK_RELAX_SPEC): Wrap in %{!r...}.
2025-08-05	AVR: Allow combination of sign_extend with ashift.	Georg-Johann Lay	2	-1/+46
	gcc/ * config/avr/avr.cc (avr_rtx_costs_1) [SIGN_EXTEND]: Adjust cost. * config/avr/avr.md (*sext.ashift<QIPSI:mode><HISI:mode>2): New insn and a cc split.
2025-08-05	AVR: target/121359: Remove -mlra and remains of reload.	Georg-Johann Lay	7	-180/+6
	PR target/121359 gcc/ * config/avr/avr.h: Remove -mlra and remains of reload. * config/avr/avr.cc: Same. * config/avr/avr.md: Same. * config/avr/avr-log.cc: Same. * config/avr/avr-protos.h: Same. * config/avr/avr.opt: Same. * config/avr/avr.opt.urls: Same. gcc/testsuite/ * gcc.target/avr/torture/pr118591-1.c: Remove -mlra. * gcc.target/avr/torture/pr118591-2.c: Same.
2025-08-03	AVR: Use avr_add_ccclobber / DONE_ADD_CCC in md instead of repeats.	Georg-Johann Lay	3	-961/+436
	There are many post-reload define_insn_and_split's that just append a (clobber (reg:CC REG_CC)) to the pattern. Instead of repeating the original patterns, avr_add_ccclobber (curr_insn) is used to do that job. This avoids repeating patterns all over the place, and splits that do something different (like using a canonical form) stand out clearly. gcc/ * config/avr/avr.md (define_insn_and_split) [reload_completed]: For splits that just append a (clobber (reg:CC REG_CC)) to the pattern, use avr_add_ccclobber (curr_insn) instead of repeating the original pattern. * config/avr/avr-dimode.md: Same. * config/avr/avr-fixed.md: Same.
2025-08-03	AVR: Add avr.cc::avr_add_ccclobber().	Georg-Johann Lay	2	-0/+25
	gcc/ * config/avr/avr.cc (avr_add_ccclobber): New function. * config/avr/avr-protos.h (avr_add_ccclobber): New proto. (DONE_ADD_CCC): New define.
2025-07-31	AVR: avr.opt.urls: Add -mfuse-move2	Georg-Johann Lay	1	-0/+3
	PR rtl-optimization/121340 gcc/ * config/avr/avr.opt.urls (-mfuse-move2): Add url.
2025-07-31	AVR: Set .type of jump table label.	Georg-Johann Lay	1	-0/+7
	gcc/ * config/avr/avr.cc (avr_output_addr_vec) <labl>: Asm out its .type.
2025-07-31	AVR: rtl-optimization/121340 - New mini-pass to undo superfluous moves from insn combine.	Georg-Johann Lay	4	-0/+152
	Insn combine may come up with superfluous reg-reg moves, where the combine people say that these are no problem since reg-alloc is supposed to optimize them. The issue is that the lower-subreg pass sitting between combine and reg-alloc may split such moves, coming up with a zoo of subregs which are only handled poorly by the register allocator. This patch adds a new avr mini-pass that handles such cases. As an example, take int f_ffssi (long x) { return __builtin_ffsl (x); } where the two functions have the same interface, i.e. there are no extra moves required for the argument or for the return value. However, $ avr-gcc -S -Os -dp -mno-fuse-move ... f_ffssi: mov r20,r22 ; 29 [c=4 l=1] movqi_insn/0 mov r21,r23 ; 30 [c=4 l=1] movqi_insn/0 mov r22,r24 ; 31 [c=4 l=1] movqi_insn/0 mov r23,r25 ; 32 [c=4 l=1] movqi_insn/0 mov r25,r23 ; 33 [c=4 l=4] movsi/0 mov r24,r22 mov r23,r21 mov r22,r20 rcall __ffssi2 ; 34 [c=16 l=1] ffssihi2.libgcc ret ; 37 [c=0 l=1] return where all the moves add up to a no-op. The -mno-fuse-move option stops any attempts by the avr backend to clean up that mess. PR rtl-optimization/121340 gcc/ * config/avr/avr.opt (-mfuse-move2): New option. * config/avr/avr-passes.def (avr_pass_2moves): Insert after combine. * config/avr/avr-passes.cc (make_avr_pass_2moves): New function. (pass_data avr_pass_data_2moves): New static variable. (avr_pass_2moves): New rtl_opt_pass. * config/avr/avr-protos.h (make_avr_pass_2moves): New proto. * common/config/avr/avr-common.cc (default_options avr_option_optimization_table) <-mfuse-move2>: Set for -O1 and higher. * doc/invoke.texi (AVR Options) <-mfuse-move2>: Document.
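The claim that the eight moves in the listing add up to a no-op can be checked by replaying them on a small model of the register file. The following is only a sketch: `regfile_t`, `mov`, `replay_moves` and `moves_are_noop` are hypothetical names made up for this illustration and are not part of GCC. It assumes that MOV Rd,Rr copies Rr into Rd and that the 32-bit argument and return value both live in R22...R25, as the listing suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the AVR register file: 32 byte-wide registers.  */
typedef struct { uint8_t r[32]; } regfile_t;

/* MOV Rd,Rr copies register Rr into register Rd.  */
static void mov (regfile_t *rf, int rd, int rr)
{
  rf->r[rd] = rf->r[rr];
}

/* Replay the eight moves from the listing, in program order.  */
static void replay_moves (regfile_t *rf)
{
  mov (rf, 20, 22);
  mov (rf, 21, 23);
  mov (rf, 22, 24);
  mov (rf, 23, 25);
  mov (rf, 25, 23);
  mov (rf, 24, 22);
  mov (rf, 23, 21);
  mov (rf, 22, 20);
}

/* Return 1 if R22...R25 hold the same bytes before and after the moves.  */
static int moves_are_noop (uint32_t x)
{
  regfile_t rf = { { 0 } };
  for (int i = 0; i < 4; i++)
    rf.r[22 + i] = (uint8_t) (x >> (8 * i));

  regfile_t before = rf;
  replay_moves (&rf);

  for (int i = 22; i <= 25; i++)
    if (rf.r[i] != before.r[i])
      return 0;
  return 1;
}

static void self_check (void)
{
  assert (moves_are_noop (0x12345678u));
  assert (moves_are_noop (0xdeadbeefu));
}
```

In this model only R20 and R21 end up modified, which is why removing the moves does not change what `__ffssi2` sees in R22...R25.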
2025-07-28	AVR: target/121277 - Don't load 0x800000 with const __flashx *x = NULL.	Georg-Johann Lay	1	-6/+13
	Converting from generic AS to __flashx used the same rule as for __memx, which tags RAM (generic AS) locations by setting bit 23. The justification was that generic isn't a subset of __flashx, though that led to surprises with code like const __flashx *x = NULL. The natural thing to do is to just load 0x000000 in that case, so that the null pointer works in __flashx as expected. Apart from that, converting NULL to __flashx (or __flash) no longer raises a -Waddr-space-convert diagnostic. gcc/ PR target/121277 * config/avr/avr.cc (avr_addr_space_convert): When converting from generic AS to __flashx, don't set bit 23. (avr_convert_to_type): Don't -Waddr-space-convert when NULL is converted to __flashx or to __flash.
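The effect of the bit-23 tag on a null pointer can be modelled with plain integers on any host. This is a minimal sketch, not the code in avr_addr_space_convert: the function names are hypothetical, 24-bit addresses are held in a `uint32_t`, and it is an assumption that the new conversion simply zero-extends the 16-bit generic address.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 23 marks a 24-bit address as a RAM location (the 0x800000 of the
   commit title).  */
#define RAM_TAG 0x800000u

/* Generic (16-bit) -> __memx (24-bit): tag the address as RAM.  */
static uint32_t generic_to_memx (uint16_t addr)
{
  return (uint32_t) addr | RAM_TAG;
}

/* Generic -> __flashx with the old rule, which was the same as for __memx.  */
static uint32_t generic_to_flashx_old (uint16_t addr)
{
  return (uint32_t) addr | RAM_TAG;
}

/* Generic -> __flashx without the tag (assumed here: plain zero-extension).  */
static uint32_t generic_to_flashx_new (uint16_t addr)
{
  return (uint32_t) addr;
}

static void self_check (void)
{
  /* With the old rule, NULL became 0x800000 and no longer compared
     equal to zero.  */
  assert (generic_to_flashx_old (0) == 0x800000u);
  /* Without the tag, NULL stays 0x000000.  */
  assert (generic_to_flashx_new (0) == 0x000000u);
  /* __memx keeps tagging RAM locations.  */
  assert (generic_to_memx (0x0100) == 0x800100u);
}
```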
2025-07-19	AVR: Fuse get_insns with end_sequence.	Georg-Johann Lay	1	-4/+2
	gcc/ * config/avr/avr-passes.cc (avr_optimize_casesi): Fuse get_insns() with end_sequence().
2025-07-06	AVR: Fix a typo in avr-mcus.def.	Georg-Johann Lay	1	-11/+11
	gcc/ * config/avr/avr-mcus.def: -mmcu= takes lower case MCU names. * doc/avr-mmcu.texi: Rebuild.
2025-07-06	AVR: Add support for AVR32DAxxS, AVR64DAxxS, AVR128DAxxS devices.	Georg-Johann Lay	1	-0/+11
	gcc/ * config/avr/avr-mcus.def (avr32da28S, avr32da32S, avr32da48S) (avr64da28S, avr64da32S, avr64da48S, avr64da64S) (avr128da28S, avr128da32S, avr128da48S, avr128da64S): Add devices. * doc/avr-mmcu.texi: Rebuild.
2025-06-28	AVR: target/120856 - Deny R24:DI in avr_hard_regno_mode_ok with Reload.	Georg-Johann Lay	1	-1/+1
	This fixes an ICE with -mno-lra when split2 tries to split the following zero_extendsidi2 insn: (set (reg:DI 24) (zero_extend:DI (reg:SI *))) The ICE is because avr_hard_regno_mode_ok allows R24:DI but disallows R28:SI when Reload is used. R28:SI is a result of zero_extendsidi2. This ICE only occurs with Reload (which will die before very long), but it occurs when building libgcc. gcc/ PR target/120856 config/avr/avr.cc (avr_hard_regno_mode_ok) [-mno-lra]: Deny hard regs >= 4 bytes that overlap Y.
2025-06-27	AVR: target/113934 - Use LRA per default.	Georg-Johann Lay	1	-2/+2
	Now that the patches for PR120424 are upstream, the last known bug associated with avr+lra has been fixed: PR118591. So we can pull the switch that turns on LRA per default. This patch only sets -mlra per default. It doesn't do any Reload related cleanup or removal from the avr backend, hence -mno-lra still works. The only new problem is that gcc.dg/torture/pr64088.c fails with LRA but not with Reload. Though that test case is awkward since it is UB but expects the compiler to behave in a specific way which avr-gcc doesn't do: PR116780. This patch also avoids a relatively recent ICE that breaks building libgcc: R24:DI is allowed per hard_regno_mode_ok, but R26:SI is disallowed for Reload for old reasons. Outcome is that a split2 pattern for R24:DI = zero_extend:DI (R22:SI) runs into an ICE. AVR-LibC builds fine with this patch. The AVR-LibC testsuite passes without errors. gcc/ PR target/113934 * config/avr/avr.opt (-mlra): Turn on per default.
2025-06-14	AVR: Fix PR120423 / PR116389.	Georg-Johann Lay	1	-0/+35
	The problem with PR120423 and PR116389 is that reload might assign an invalid hard register to a paradoxical subreg. For example with the test case from the PR, it assigns (REG:QI 31) to the inner of (subreg:HI (QI) 0) which is valid, but the subreg will be turned into (REG:HI 31) which is invalid and triggers an ICE in postreload. The problem only occurs with the old reload pass. The patch maps the paradoxical subregs to zero-extends which will be allocated correctly. For the 120423 testcases, the code is the same as with -mlra (which doesn't implement the fix), so the patch doesn't even introduce a performance penalty. The patch is only needed for v15: v14 is not affected, and in v16 reload will be removed. PR rtl-optimization/120423 PR rtl-optimization/116389 gcc/ * config/avr/avr.md [-mno-lra]: Add pre-reload split to transform (left shift of) a paradoxical subreg to a (left shift of) zero-extend. gcc/testsuite/ * gcc.target/avr/torture/pr120423-1.c: New test. * gcc.target/avr/torture/pr120423-2.c: New test. * gcc.target/avr/torture/pr120423-116389.c: New test. (cherry picked from commit 61789b5abec3079d02ee9eaa7468015ab1f6f701)
2025-05-16	Automatic replacement of end_sequence/return pairs	Richard Sandiford	1	-3/+1
	This is the result of using a regexp to replace: rtx( \|_insn )<stuff> = end_sequence (); return <stuff>; with: return end_sequence (); gcc/ asan.cc (asan_emit_allocas_unpoison): Directly return the result of end_sequence. (hwasan_emit_untag_frame): Likewise. * config/aarch64/aarch64-speculation.cc (aarch64_speculation_clobber_sp): Likewise. (aarch64_speculation_establish_tracker): Likewise. * config/arm/arm.cc (arm_call_tls_get_addr): Likewise. * config/avr/avr-passes.cc (avr_parallel_insn_from_insns): Likewise. * config/sh/sh_treg_combine.cc (sh_treg_combine::make_not_reg_insn): Likewise. * tree-outof-ssa.cc (emit_partition_copy): Likewise.
2025-05-16	Automatic replacement of get_insns/end_sequence pairs	Richard Sandiford	2	-12/+6
	This is the result of using a regexp to replace instances of: <stuff> = get_insns (); end_sequence (); with: <stuff> = end_sequence (); where the indentation is the same for both lines, and where there might be blank lines inbetween. gcc/ * asan.cc (asan_clear_shadow): Use the return value of end_sequence, rather than calling get_insns separately. (asan_emit_stack_protection, asan_emit_allocas_unpoison): Likewise. (hwasan_frame_base, hwasan_emit_untag_frame): Likewise. * auto-inc-dec.cc (attempt_change): Likewise. * avoid-store-forwarding.cc (process_store_forwarding): Likewise. * bb-reorder.cc (fix_crossing_unconditional_branches): Likewise. * builtins.cc (expand_builtin_apply_args): Likewise. (expand_builtin_return, expand_builtin_mathfn_ternary): Likewise. (expand_builtin_mathfn_3, expand_builtin_int_roundingfn): Likewise. (expand_builtin_int_roundingfn_2, expand_builtin_saveregs): Likewise. (inline_string_cmp): Likewise. * calls.cc (expand_call): Likewise. * cfgexpand.cc (expand_asm_stmt, pass_expand::execute): Likewise. * cfgloopanal.cc (init_set_costs): Likewise. * cfgrtl.cc (insert_insn_on_edge, prepend_insn_to_edge): Likewise. (rtl_lv_add_condition_to_bb): Likewise. * config/aarch64/aarch64-speculation.cc (aarch64_speculation_clobber_sp): Likewise. (aarch64_speculation_establish_tracker): Likewise. (aarch64_do_track_speculation): Likewise. * config/aarch64/aarch64.cc (aarch64_load_symref_appropriately) (aarch64_expand_vector_init, aarch64_gen_ccmp_first): Likewise. (aarch64_gen_ccmp_next, aarch64_mode_emit): Likewise. (aarch64_md_asm_adjust): Likewise. (aarch64_switch_pstate_sm_for_landing_pad): Likewise. (aarch64_switch_pstate_sm_for_jump): Likewise. (aarch64_switch_pstate_sm_for_call): Likewise. * config/alpha/alpha.cc (alpha_legitimize_address_1): Likewise. (alpha_emit_xfloating_libcall, alpha_gp_save_rtx): Likewise. * config/arc/arc.cc (hwloop_optimize): Likewise. * config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise. 
* config/arm/arm-builtins.cc: Likewise. * config/arm/arm.cc (require_pic_register): Likewise. (arm_call_tls_get_addr, arm_gen_load_multiple_1): Likewise. (arm_gen_store_multiple_1, cmse_clear_registers): Likewise. (cmse_nonsecure_call_inline_register_clear): Likewise. (arm_attempt_dlstp_transform): Likewise. * config/avr/avr-passes.cc (bbinfo_t::optimize_one_block): Likewise. (avr_parallel_insn_from_insns): Likewise. * config/avr/avr.cc (avr_prologue_setup_frame): Likewise. (avr_expand_epilogue): Likewise. * config/bfin/bfin.cc (hwloop_optimize): Likewise. * config/c6x/c6x.cc (c6x_expand_compare): Likewise. * config/cris/cris.cc (cris_split_movdx): Likewise. * config/cris/cris.md: Likewise. * config/csky/csky.cc (csky_call_tls_get_addr): Likewise. * config/epiphany/resolve-sw-modes.cc (pass_resolve_sw_modes::execute): Likewise. * config/fr30/fr30.cc (fr30_move_double): Likewise. * config/frv/frv.cc (frv_split_scc, frv_split_cond_move): Likewise. (frv_split_minmax, frv_split_abs): Likewise. * config/frv/frv.md: Likewise. * config/gcn/gcn.cc (move_callee_saved_registers): Likewise. (gcn_expand_prologue, gcn_restore_exec, gcn_md_reorg): Likewise. * config/i386/i386-expand.cc (ix86_expand_carry_flag_compare, ix86_expand_int_movcc): Likewise. (ix86_vector_duplicate_value, expand_vec_perm_interleave2): Likewise. (expand_vec_perm_vperm2f128_vblend): Likewise. (expand_vec_perm_2perm_interleave): Likewise. (expand_vec_perm_2perm_pblendv): Likewise. (expand_vec_perm2_vperm2f128_vblend, ix86_gen_ccmp_first): Likewise. (ix86_gen_ccmp_next): Likewise. * config/i386/i386-features.cc (scalar_chain::make_vector_copies): Likewise. (scalar_chain::convert_reg, scalar_chain::convert_op): Likewise. (timode_scalar_chain::convert_insn): Likewise. * config/i386/i386.cc (ix86_init_pic_reg, ix86_va_start): Likewise. (ix86_get_drap_rtx, legitimize_tls_address): Likewise. (ix86_md_asm_adjust): Likewise. * config/ia64/ia64.cc (ia64_expand_tls_address): Likewise. 
(ia64_expand_compare, spill_restore_mem): Likewise. (expand_vec_perm_interleave_2): Likewise. * config/loongarch/loongarch.cc (loongarch_call_tls_get_addr): Likewise. * config/m32r/m32r.cc (gen_split_move_double): Likewise. * config/m32r/m32r.md: Likewise. * config/m68k/m68k.cc (m68k_call_tls_get_addr): Likewise. (m68k_call_m68k_read_tp, m68k_sched_md_init_global): Likewise. * config/m68k/m68k.md: Likewise. * config/microblaze/microblaze.cc (microblaze_call_tls_get_addr): Likewise. * config/mips/mips.cc (mips_call_tls_get_addr): Likewise. (mips_ls2_init_dfa_post_cycle_insn): Likewise. (mips16_split_long_branches): Likewise. * config/nvptx/nvptx.cc (nvptx_gen_shuffle): Likewise. (nvptx_gen_shared_bcast, nvptx_propagate): Likewise. (workaround_uninit_method_1, workaround_uninit_method_2): Likewise. (workaround_uninit_method_3): Likewise. * config/or1k/or1k.cc (or1k_init_pic_reg): Likewise. * config/pa/pa.cc (legitimize_tls_address): Likewise. * config/pru/pru.cc (pru_expand_fp_compare, pru_reorg_loop): Likewise. * config/riscv/riscv-shorten-memrefs.cc (pass_shorten_memrefs::transform): Likewise. * config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Likewise. * config/riscv/riscv.cc (riscv_call_tls_get_addr): Likewise. (riscv_frm_emit_after_bb_end): Likewise. * config/rl78/rl78.cc (rl78_emit_libcall): Likewise. * config/rs6000/rs6000.cc (rs6000_debug_legitimize_address): Likewise. * config/s390/s390.cc (legitimize_tls_address): Likewise. (s390_two_part_insv, s390_load_got, s390_va_start): Likewise. * config/sh/sh_treg_combine.cc (sh_treg_combine::make_not_reg_insn): Likewise. * config/sparc/sparc.cc (sparc_legitimize_tls_address): Likewise. (sparc_output_mi_thunk, sparc_init_pic_reg): Likewise. * config/stormy16/stormy16.cc (xstormy16_split_cbranch): Likewise. * config/xtensa/xtensa.cc (xtensa_copy_incoming_a7): Likewise. (xtensa_expand_block_set_libcall): Likewise. (xtensa_expand_block_set_unrolled_loop): Likewise. 
(xtensa_expand_block_set_small_loop, xtensa_call_tls_desc): Likewise. * dse.cc (emit_inc_dec_insn_before, find_shift_sequence): Likewise. (replace_read): Likewise. * emit-rtl.cc (reorder_insns, gen_clobber, gen_use): Likewise. * except.cc (dw2_build_landing_pads, sjlj_mark_call_sites): Likewise. (sjlj_emit_function_enter, sjlj_emit_function_exit): Likewise. (sjlj_emit_dispatch_table): Likewise. * expmed.cc (expmed_mult_highpart_optab, expand_sdiv_pow2): Likewise. * expr.cc (convert_mode_scalar, emit_move_multi_word): Likewise. (gen_move_insn, expand_cond_expr_using_cmove): Likewise. (expand_expr_divmod, expand_expr_real_2): Likewise. (maybe_optimize_pow2p_mod_cmp, maybe_optimize_mod_cmp): Likewise. * function.cc (emit_initial_value_sets): Likewise. (instantiate_virtual_regs_in_insn, expand_function_end): Likewise. (get_arg_pointer_save_area, make_split_prologue_seq): Likewise. (make_prologue_seq, gen_call_used_regs_seq): Likewise. (thread_prologue_and_epilogue_insns): Likewise. (match_asm_constraints_1): Likewise. * gcse.cc (prepare_copy_insn): Likewise. * ifcvt.cc (noce_emit_store_flag, noce_emit_move_insn): Likewise. (noce_emit_cmove): Likewise. * init-regs.cc (initialize_uninitialized_regs): Likewise. * internal-fn.cc (expand_POPCOUNT): Likewise. * ira-emit.cc (emit_move_list): Likewise. * ira.cc (ira): Likewise. * loop-doloop.cc (doloop_modify): Likewise. * loop-unroll.cc (compare_and_jump_seq): Likewise. (unroll_loop_runtime_iterations, insert_base_initialization): Likewise. (split_iv, insert_var_expansion_initialization): Likewise. (combine_var_copies_in_loop_exit): Likewise. * lower-subreg.cc (resolve_simple_move,resolve_shift_zext): Likewise. * lra-constraints.cc (match_reload, check_and_process_move): Likewise. (process_addr_reg, insert_move_for_subreg): Likewise. (process_address_1, curr_insn_transform): Likewise. (inherit_reload_reg, process_invariant_for_inheritance): Likewise. (inherit_in_ebb, remove_inheritance_pseudos): Likewise. 
* lra-remat.cc (do_remat): Likewise. * mode-switching.cc (commit_mode_sets): Likewise. (optimize_mode_switching): Likewise. * optabs.cc (expand_binop, expand_twoval_binop_libfunc): Likewise. (expand_clrsb_using_clz, expand_doubleword_clz_ctz_ffs): Likewise. (expand_doubleword_popcount, expand_ctz, expand_ffs): Likewise. (expand_absneg_bit, expand_unop, expand_copysign_bit): Likewise. (prepare_float_lib_cmp, expand_float, expand_fix): Likewise. (expand_fixed_convert, gen_cond_trap): Likewise. (expand_atomic_fetch_op): Likewise. * ree.cc (combine_reaching_defs): Likewise. * reg-stack.cc (compensate_edge): Likewise. * reload1.cc (emit_input_reload_insns): Likewise. * sel-sched-ir.cc (setup_nop_and_exit_insns): Likewise. * shrink-wrap.cc (emit_common_heads_for_components): Likewise. (emit_common_tails_for_components): Likewise. (insert_prologue_epilogue_for_components): Likewise. * tree-outof-ssa.cc (emit_partition_copy): Likewise. (insert_value_copy_on_edge): Likewise. * tree-ssa-loop-ivopts.cc (computation_cost): Likewise.
2025-04-30	AVR: target/119989 - Add missing clobbers to xload_<mode>_libgcc.	Georg-Johann Lay	1	-0/+4
	libgcc's __xload_1...4 is clobbering Z (and also R21 in some cases), but avr.md had clobbers of respective GPRs only up to reload. Outcome was that code reading from the same __memx address twice could be wrong. This patch adds respective clobbers. Forward-port from 2025-04-30 r14-11703 PR target/119989 gcc/ * config/avr/avr.md (xload_<mode>_libgcc): Clobber R21, Z. gcc/testsuite/ * gcc.target/avr/torture/pr119989.h: New file. * gcc.target/avr/torture/pr119989-memx-1.c: New test. * gcc.target/avr/torture/pr119989-memx-2.c: New test. * gcc.target/avr/torture/pr119989-memx-3.c: New test. * gcc.target/avr/torture/pr119989-memx-4.c: New test. * gcc.target/avr/torture/pr119989-flashx-1.c: New test. * gcc.target/avr/torture/pr119989-flashx-2.c: New test. * gcc.target/avr/torture/pr119989-flashx-3.c: New test. * gcc.target/avr/torture/pr119989-flashx-4.c: New test. (cherry picked from commit 1ca1c1fc3b58ae5e1d3db4f5a2014132fe69f82a)
2025-03-23	AVR: Add AVR-SD devices.	Georg-Johann Lay	1	-0/+6
	gcc/ * config/avr/avr-mcus.def: Add AVR32SD20, AVR32SD28, AVR32SD32, AVR64SD28, AVR64SD32, AVR64SD48. * doc/avr-mmcu.texi: Rebuild.
2025-03-22	AVR: Use "avr-peep2-after-fuse-move" for the 2nd run of peephole2.	Georg-Johann Lay	1	-0/+1
	This patch uses a name for the dump file that makes it clear where in the pass chain the 2nd run of peephole2 is located. gcc/ * config/avr/avr.cc (avr_option_override): Use "avr-peep2-after-fuse-move" as dump name instead of "peephole2".
2025-03-22	avr.opt.urls += -muse-nonzero-bits	Georg-Johann Lay	1	-0/+3
	gcc/ * config/avr/avr.opt.urls: Add -muse-nonzero-bits.
2025-03-22	AVR: target/119421 Better optimize some bit operations.	Georg-Johann Lay	6	-0/+410
	There are occasions where knowledge about nonzero bits makes some optimizations possible. For example, Rd |= Rn << Off can be implemented as SBRC Rn, 0 ORI Rd, 1 << Off when Rn in { 0, 1 }, i.e. nonzero_bits (Rn) == 1. This patch adds some patterns that exploit nonzero_bits() in some combiner patterns. As insn conditions are not supposed to contain nonzero_bits(), the patch splits such insns right after pass insn combine. PR target/119421 gcc/ * config/avr/avr.opt (-muse-nonzero-bits): New option. * config/avr/avr-protos.h (avr_nonzero_bits_lsr_operands_p): New. (make_avr_pass_split_nzb): New. * config/avr/avr.cc (avr_nonzero_bits_lsr_operands_p): New function. (avr_rtx_costs_1): Return costs for the new insns. * config/avr/avr.md (nzb): New insn attribute. (nzb=1.<code>...): New insns to better support some bit operations for <code> in AND, IOR, XOR. * config/avr/avr-passes.def (avr_pass_split_nzb): Insert pass after combine. * config/avr/avr-passes.cc (avr_pass_data_split_nzb): New pass data. (avr_pass_split_nzb): New pass. (make_avr_pass_split_nzb): New function. * common/config/avr/avr-common.cc (avr_option_optimization_table): Enable -muse-nonzero-bits for -O2 and higher. * doc/invoke.texi (AVR Options): Document -muse-nonzero-bits. gcc/testsuite/ * gcc.target/avr/torture/pr119421-sreg.c: New test.
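Why the skip-and-set sequence is only valid under the nonzero_bits condition can be seen from a small host-side model. This is a sketch with hypothetical function names, not GCC code: `ior_shift` is the plain C meaning of the operation, and `ior_skip` mimics SBRC Rn,0 (skip the next instruction if bit 0 of Rn is clear) followed by ORI Rd, 1 << Off.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics: Rd |= Rn << Off on 8-bit registers.  */
static uint8_t ior_shift (uint8_t rd, uint8_t rn, unsigned off)
{
  return (uint8_t) (rd | (rn << off));
}

/* Model of  SBRC Rn,0 ; ORI Rd, 1 << Off :
   the ORI is executed only when bit 0 of Rn is set.  */
static uint8_t ior_skip (uint8_t rd, uint8_t rn, unsigned off)
{
  if (rn & 1)
    rd |= (uint8_t) (1u << off);
  return rd;
}

/* Return 1 if both forms agree for every Rd, every Off in 0...7 and
   every Rn in { 0, 1 }.  */
static int forms_agree_for_0_and_1 (void)
{
  for (unsigned rd = 0; rd < 256; rd++)
    for (unsigned rn = 0; rn <= 1; rn++)
      for (unsigned off = 0; off < 8; off++)
        if (ior_shift ((uint8_t) rd, (uint8_t) rn, off)
            != ior_skip ((uint8_t) rd, (uint8_t) rn, off))
          return 0;
  return 1;
}

static void self_check (void)
{
  assert (forms_agree_for_0_and_1 ());
  /* Outside { 0, 1 } the forms differ, e.g. for Rn = 2 and Off = 0.  */
  assert (ior_shift (0, 2, 0) == 2);
  assert (ior_skip (0, 2, 0) == 0);
}
```

The counterexample with Rn = 2 shows why the replacement needs the knowledge that only bit 0 of Rn can be nonzero.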
2025-03-22	AVR: Add attribute "used" for code in .initN and .finiN sections.	Georg-Johann Lay	1	-11/+34
	Code in .initN and .finiN sections is never called since these sections are special and part of the startup and shutdown code, respectively. This patch adds attribute "used" so they won't be optimized out. gcc/ * config/avr/avr.cc (avr_attrs_section_name): New function. (avr_insert_attributes): Add "used" attribute to functions in .initN and .finiN.
2025-03-18	AVR: target/119355 - Fix ICE in pass avr-fuse-move / -mfuse-move.	Georg-Johann Lay	1	-22/+32
	This ICE only occurred when the compiler is built with, say CXXFLAGS='-Wp,-D_GLIBCXX_ASSERTIONS'. The problem was that a value from an illegal REGNO was read. The value was not used in these cases, but the access triggered an assertion due to reading past std::array. gcc/ PR target/119355 * config/avr/avr-passes.cc (memento_t::apply): Only read values[p.arg] when it is actually used.
2025-03-02	avr: Fix up avr_print_operand diagnostics [PR118991]	Jakub Jelinek	1	-4/+13
	As can be seen in gcc/po/gcc.pot: #: config/avr/avr.cc:2754 #, c-format msgid "bad I/O address 0x" msgstr "" exgettext couldn't retrieve the whole format string in this case, because it uses a macro in the middle. output_operand_lossage is a c-format function though, so we can't use %wx to print HOST_WIDE_INT, and HOST_WIDE_INT_PRINT_HEX_PURE is on some hosts %lx, on others %llx and on others %I64x so isn't really translatable that way. As Joseph mentioned in the PR, there is no easy way around this but go through a temporary buffer, which the following patch does. 2025-03-02 Jakub Jelinek <jakub@redhat.com> PR translation/118991 * config/avr/avr.cc (avr_print_operand): Print ival into a temporary buffer and use %s in output_operand_lossage to make the diagnostics translatable.
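The temporary-buffer technique can be sketched in standard C. This is not the code from avr_print_operand: `format_bad_io_address` is a hypothetical helper, `snprintf` stands in for output_operand_lossage, and `PRIx64` plays the role of the host-dependent HOST_WIDE_INT format macro. The point is that the message format string is a single literal containing only %s, so a message extractor can see it in full.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format IVAL with a host-dependent macro into a temporary buffer first,
   then pass the result to the message through a plain %s.  */
static int format_bad_io_address (char *out, size_t n, int64_t ival)
{
  char buf[2 + 16 + 1];   /* "0x" + up to 16 hex digits + NUL.  */

  /* The host-dependent part stays out of the translatable string.  */
  snprintf (buf, sizeof buf, "0x%" PRIx64, (uint64_t) ival);

  /* The translatable format is one complete literal.  */
  return snprintf (out, n, "bad I/O address %s", buf);
}

static void self_check (void)
{
  char msg[64];
  format_bad_io_address (msg, sizeof msg, 0x1234);
  assert (strcmp (msg, "bad I/O address 0x1234") == 0);
}
```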
2025-02-17	AVR: ad target/118764 - Mention CVT availability in device-specs comment.	Georg-Johann Lay	1	-2/+3
	gcc/ PR target/118764 * config/avr/gen-avr-mmcu-specs.cc (print_mcu) [has CVT]: Mention CVT in header comment of generated specs file.
2025-02-16	AVR: ad target/118764 - Let -mcvt set built-in macro __AVR_CVT__	Georg-Johann Lay	1	-0/+3
	gcc/ PR target/118764 * config/avr/avr-c.cc (avr_cpu_cpp_builtins) [TARGET_CVT]: Define __AVR_CVT__. * doc/invoke.texi (AVR Built-in Macros): Document __AVR_CVT__.
2025-02-16	AVR: Don't asm output operations for unused result bytes.	Georg-Johann Lay	1	-10/+48
	When REG_UNUSED notes indicate that some result bytes are not used by the following code, then there's no need to asm out them. The patch uses such notes for the asm out of AND, IOR, XOR, PLUS, MINUS. gcc/ * config/avr/avr.cc (avr_result_regno_unused_p): New static function. (avr_out_bitop): Only output result bytes that are used. (avr_out_plus_1): Same.
2025-02-16	AVR: Diagnose unsupported built-ins in avr_resolve_overloaded_builtin.	Georg-Johann Lay	3	-30/+43
	This patch executes avr_builtin_supported_p at a later time and in avr_resolve_overloaded_builtin. This allows for better diagnostics and avoids lto1 hiccups when a built-in decl is NULL_TREE. gcc/ * config/avr/avr-protos.h (avr_builtin_supported_p): Remove. * config/avr/avr.cc (avr_init_builtins): Don't initialize non-available built-ins with NULL_TREE. (avr_builtin_supported_p): Move to... * config/avr/avr-c.cc: ...here. (avr_resolve_overloaded_builtin): Run avr_builtin_supported_p.
2025-02-14	AVR: target/118878 - Don't ICE on result from paradoxical reg's alloc.	Georg-Johann Lay	1	-7/+13
	After register allocation, paradoxical subregs may become something like r20:SI += r22:SI which doesn't make much sense as assembly code. Hence avr_out_plus_1() used to ICE on such code. However, paradoxical subregs appear to be a common optimization device (instead of proper mode demotion). PR target/118878 gcc/ * config/avr/avr.cc (avr_out_plus_1): Don't ICE on result of paradoxical reg's register allocation. gcc/testsuite/ * gcc.target/avr/torture/pr118878.c: New test.
2025-02-12	avr.opt.urls += -mcall-main	Georg-Johann Lay	1	-0/+3
	gcc/ * config/avr/avr.opt.urls: Add -mcall-main.
2025-02-12	AVR: target/118806 - Add -mno-call-main to tweak running main().	Georg-Johann Lay	2	-0/+55
	On devices with very limited resources, it may be desirable to run main in a more efficient way than provided by the startup code XCALL main XJMP exit from section .init9. In AVR-LibC v2.3, that code has been moved to libmcu.a, hence symbol __call_main can be satisfied so that the respective code is no longer pulled in from that library. Instead, main can be run by putting it in section .init9. The patch adds attributes noreturn and section(".init9"), and sets __call_main=0 when it encounters main(). gcc/ PR target/118806 * config/avr/avr.opt (-mcall-main): New option and... (avropt_call_main): ...variable. * config/avr/avr.cc (avr_no_call_main_p): New variable. (avr_insert_attributes) [-mno-call-main, main]: Add attributes noreturn and section(".init9") to main. Set avr_no_call_main_p. (avr_file_end) [avr_no_call_main_p]: Define symbol __call_main. * doc/invoke.texi (AVR Options) <-mno-call-main>: Document. <-mnodevicelib>: Extend explanation.
2025-02-06	avr.opt.urls += -mcvt	Georg-Johann Lay	1	-0/+3
	gcc/ * config/avr/avr.opt.urls: Add mcvt.
2025-02-06	AVR: Add support for a Compact Vector Table (-mcvt).	Georg-Johann Lay	5	-117/+142
	Some AVR devices support a CVT: - Devices from the 0-series, 1-series, 2-series. - AVR16, AVR32, AVR64, AVR128 devices. The support is provided by means of a startup code file crt<mcu>-cvt.o from AVR-LibC v2.3 that can be linked instead of the traditional crt<mcu>.o. This patch adds a new command line option -mcvt that links that CVT startup code (or issues an error when the device doesn't support a CVT). PR target/118764 gcc/ * config/avr/avr.opt (-mcvt): New target option. * config/avr/avr-arch.h (AVR_CVT): New enum value. * config/avr/avr-mcus.def: Add AVR_CVT flag for devices that support it. * config/avr/avr.cc (avr_handle_isr_attribute) [TARGET_CVT]: Issue an error when a vector number larger than 3 is used. * config/avr/gen-avr-mmcu-specs.cc (McuInfo.have_cvt): New property. (print_mcu) <avrlibc_startfile>: Use crt<mcu>-cvt.o depending on -mcvt (or issue an error when the device doesn't support a CVT). * doc/invoke.texi (AVR Options): Document -mcvt.
2025-02-06	AVR: genmultilib.awk - Use more robust parsing of spaces.	Georg-Johann Lay	1	-5/+32
	gcc/ PR target/118768 * config/avr/genmultilib.awk: Parse the AVR_MCU lines in a more robust way w.r.t. white spaces.
2025-01-30	AVR: Provide built-ins for strlen where the string lives in some AS.	Georg-Johann Lay	2	-1/+40
	This patch adds built-in functions __builtin_avr_strlen_flash, __builtin_avr_strlen_flashx and __builtin_avr_strlen_memx. Purpose is that higher-level functions can use __builtin_constant_p on strlen without raising a diagnostic due to -Waddr-space-convert. gcc/ * config/avr/builtins.def (STRLEN_FLASH, STRLEN_FLASHX) (STRLEN_MEMX): New DEF_BUILTIN's. * config/avr/avr.cc (avr_ftype_strlen): New static function. (avr_builtin_supported_p): New built-ins are not for AVR_TINY. (avr_init_builtins) <strlen_flash_node, strlen_flashx_node, strlen_memx_node>: Provide new fntypes. (avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH] [AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if possible. * doc/extend.texi (AVR Built-in Functions): Document __builtin_avr_strlen_flash, __builtin_avr_strlen_flashx, __builtin_avr_strlen_memx. libgcc/ * config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx. * config/avr/lib1funcs.S <L_strlen_memx, __strlen_memx>: Implement.
2025-01-30	AVR: Only provide a built-in when it is available.	Georg-Johann Lay	4	-4/+32
	Some built-ins are not available for C++ since they are using named address-spaces or fixed-point types. gcc/ * config/avr/builtins.def (AVR_FIRST_C_ONLY_BUILTIN_ID): New macro. * config/avr/avr-protos.h (avr_builtin_supported_p): New. * config/avr/avr.cc (avr_builtin_supported_p): New function. (avr_init_builtins): Only provide a built-in when it is supported. * config/avr/avr-c.cc (avr_cpu_cpp_builtins): Only define the __BUILTIN_AVR_<NAME> built-in defines when the associated built-in function is supported. * doc/extend.texi (AVR Built-in Functions): Add a note that following built-ins are supported only for GNU-C.
2025-01-29	AVR: Allow to share libgcc's __negsi2.	Georg-Johann Lay	1	-0/+9
	libgcc has a module for __negsi2: REG_22:SI := - REG_22:SI. This patch adds a pattern that allows to share that function provided optimize_size. gcc/ * config/avr/avr.md (*negsi2.libgcc): New insn.
2025-01-23	AVR: PR118012 - Try to work around sick code from match.pd.	Georg-Johann Lay	4	-7/+564
	This patch tries to work around PR118012 which may use a full fledged multiplication instead of a simple bit test. This is because match.pd's /* (zero_one == 0) ? y : z <op> y -> ((typeof(y))zero_one * z) <op> y */ /* (zero_one != 0) ? z <op> y : y -> ((typeof(y))zero_one * z) <op> y */ "optimizes" code with op in { plus, ior, xor } like if (a & 1) b = b <op> c; to something like: x1 = EXTRACT_BIT0 (a); x2 = c MULT x1; b = b <op> x2; or x1 = EXTRACT_BIT0 (a); x2 = ZERO_EXTEND (x1); x3 = NEG x2; x4 = c AND x3; b = b <op> x4; which is very expensive and may even result in a libgcc call for a 32-bit multiplication on devices that don't even have MUL. Notice that EXTRACT_BIT0 is already more expensive (slower, more code, more register pressure) than a bit-test + branch. The patch: o Adds some combiner patterns that try to map sick code back to a bit test + branch. o Adjusts costs to make MULT (x AND 1) cheap, in the hope that the middle-end will use that alternative (which we map to sane code). o On devices without MUL, 32-bit multiplication was performed by a library call, which bypasses the MULT (x AND 1) and similar patterns. Therefore, mulsi3 is also allowed for devices without MUL so that we get a MULT pattern that can be transformed. (Though this is not possible on AVR_TINY since it passes arguments on the stack). o Add a new command line option -mpr118012, so most of the patterns and cost computations can be switched off as they have avropt_pr118012 in their insn condition. o Added sign-extract.0 patterns unconditionally (no avropt_pr118012). Notice that this patch is just a work-around, it's not a fix of the root cause, which are the patterns in match.pd that don't care about the target and don't even care about costs. The work-around is incomplete, and 3 of the new tests are still failing. This is because there are situations where it does not work: * The MULT is realized as a library call.
* The MULT is realized as an ASHIFT, and the ASHIFT again is transformed into something else. For example, with -O2 -mmcu=atmega128, ASHIFT(3) is transformed into ASHIFT(1) + ASHIFT(2). PR tree-optimization/118012 PR tree-optimization/118360 gcc/ * config/avr/avr.opt (-mpr118012): New undocumented option. * config/avr/avr-protos.h (avr_out_sextr) (avr_emit_skip_pixop, avr_emit_skip_clear): New protos. * config/avr/avr.cc (avr_adjust_insn_length) [case ADJUST_LEN_SEXTR]: Handle case. (avr_rtx_costs_1) [NEG]: Costs for NEG (ZERO_EXTEND (ZERO_EXTRACT)). [MULT && avropt_pr118012]: Costs for MULT (x AND 1). (avr_out_sextr, avr_emit_skip_pixop, avr_emit_skip_clear): New functions. * config/avr/avr.md [avropt_pr118012]: Add combine patterns with that condition that try to work around PR118012. (adjust_len) <sextr>: Add insn attr value. (pixop): New code iterator. (mulsi3) [avropt_pr118012 && !AVR_TINY]: Allow these in insn condition. gcc/testsuite/ * gcc.target/avr/mmcu/pr118012-1.h: New file. * gcc.target/avr/mmcu/pr118012-1-o2-m128.c: New test. * gcc.target/avr/mmcu/pr118012-1-os-m128.c: New test. * gcc.target/avr/mmcu/pr118012-1-o2-m103.c: New test. * gcc.target/avr/mmcu/pr118012-1-os-m103.c: New test. * gcc.target/avr/mmcu/pr118012-1-o2-t40.c: New test. * gcc.target/avr/mmcu/pr118012-1-os-t40.c: New test. * gcc.target/avr/mmcu/pr118360-1.h: New file. * gcc.target/avr/mmcu/pr118360-1-o2-m128.c: New test. * gcc.target/avr/mmcu/pr118360-1-os-m128.c: New test. * gcc.target/avr/mmcu/pr118360-1-o2-m103.c: New test. * gcc.target/avr/mmcu/pr118360-1-os-m103.c: New test. * gcc.target/avr/mmcu/pr118360-1-o2-t40.c: New test. * gcc.target/avr/mmcu/pr118360-1-os-t40.c: New test.
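The rewrite discussed in the PR118012 message above can be illustrated in portable C. This is a minimal sketch, not code from the patch: the function names are hypothetical, XOR stands in for `<op>`, and the mask variant is assumed to AND the mask with `c`, which is what keeps it equivalent to the source form.

```c
#include <assert.h>
#include <stdint.h>

/* Source form: test bit 0 of a, then conditionally apply the operation.
   On AVR this maps naturally to a bit test plus a skip or branch.  */
uint32_t xor_if_bit0_branch (uint8_t a, uint32_t b, uint32_t c)
{
    if (a & 1)
        b ^= c;
    return b;
}

/* Multiplication form: extract bit 0 and multiply c by it
   (hypothetical illustration of "x2 = c MULT x1").  */
uint32_t xor_if_bit0_mult (uint8_t a, uint32_t b, uint32_t c)
{
    uint32_t x1 = a & 1u;   /* EXTRACT_BIT0 (a): 0 or 1 */
    uint32_t x2 = c * x1;   /* 32-bit multiplication by 0 or 1 */
    return b ^ x2;
}

/* Mask form: negate the zero-extended bit to get 0 or all-ones,
   then AND (assumption: the mask is applied to c).  */
uint32_t xor_if_bit0_mask (uint8_t a, uint32_t b, uint32_t c)
{
    uint32_t x2 = a & 1u;   /* ZERO_EXTEND of bit 0 */
    uint32_t x3 = -x2;      /* NEG: 0 or 0xFFFFFFFF (unsigned wrap) */
    uint32_t x4 = c & x3;
    return b ^ x4;
}

/* Checks that all three forms agree for one input triple.  */
void check_equivalent (uint8_t a, uint32_t b, uint32_t c)
{
    uint32_t expected = xor_if_bit0_branch (a, b, c);
    assert (xor_if_bit0_mult (a, b, c) == expected);
    assert (xor_if_bit0_mask (a, b, c) == expected);
}
```

All three functions compute the same value; the difference is only in cost. On a target without a hardware multiplier, the multiplication form is the one that can end up as a library call.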
2025-01-23	AVR: PR117726 - Tweak 32-bit logical shifts of 25...30 for -Oz.	Georg-Johann Lay	3	-37/+175
	As it turns out, logical 32-bit shifts with an offset of 25..30 can be performed in 7 instructions or less. This beats the 7 instructions required for the default code of a shift loop. Plus, with zero overhead, these cases can be 3-operand. This is only relevant for -Oz because with -Os, 3op shifts are split with -msplit-bit-shift (which is not performed with -Oz). PR target/117726 gcc/ * config/avr/avr.cc (avr_ld_regno_p): New function. (ashlsi3_out) [case 25,26,27,28,29,30]: Handle and tweak. (lshrsi3_out): Same. (avr_rtx_costs_1) [SImode, ASHIFT, LSHIFTRT]: Adjust costs. * config/avr/avr.md (ashlsi3, ashlsi3, ashlsi3_const): Add "r,r,C4L" alternative. (lshrsi3, lshrsi3, lshrsi3_const): Add "r,r,C4R" alternative. * config/avr/constraints.md (C4R, C4L): New. gcc/testsuite/ * gcc.target/avr/torture/avr-torture.exp (AVR_TORTURE_OPTIONS): Turn one option variant into -Oz.
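One way to see why offsets 25..30 are cheap: only a single source byte contributes to the result, so the 32-bit shift reduces to an 8-bit shift by 1..6 plus byte moves and clears. The sketch below shows that arithmetic in portable C. It is not the instruction sequence emitted by ashlsi3_out or lshrsi3_out, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x << n for n in 25..30: only the low byte of x survives.
   It is shifted left by n - 24 within a byte and becomes the top byte;
   the three lower result bytes are zero.  */
uint32_t shl_25_30 (uint32_t x, unsigned n)
{
    assert (n >= 25 && n <= 30);
    uint8_t lo = (uint8_t) x;
    uint8_t top = (uint8_t) (lo << (n - 24));   /* 8-bit shift by 1..6 */
    return (uint32_t) top << 24;                /* byte move, rest cleared */
}

/* x >> n (logical) for n in 25..30: only the top byte of x survives.
   It is shifted right by n - 24 and becomes the low byte.  */
uint32_t shr_25_30 (uint32_t x, unsigned n)
{
    assert (n >= 25 && n <= 30);
    uint8_t hi = (uint8_t) (x >> 24);
    return (uint32_t) (hi >> (n - 24));         /* 8-bit shift by 1..6 */
}
```

For example, `shl_25_30 (x, 25)` agrees with `x << 25` for any 32-bit `x`, and `shr_25_30 (x, 30)` agrees with `x >> 30`.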
2025-01-21	AVR: Tweak some 16-bit shifts by using MUL.	Georg-Johann Lay	3	-10/+85
	u16 << 5 and u16 << 6 can be tweaked by using MUL instructions. Benefit is a better speed ratio with -Os and smaller size with -O2. gcc/ * config/avr/avr-passes.cc (avr_emit_shift) [ASHIFT,HImode]: Allow offsets 5 and 6 as 3op provided have MUL and a scratch. * config/avr/avr.cc (avr_optimize_size_max_p): New function. (avr_out_ashlhi3_mul): New function. (ashlhi3_out) [case 4, 5, 6]: Better speed for -Os. * config/avr/avr.md (isa) <mul, no_mul>: New attr values. (*ashlhi3_const): Add alternative for offsets 5 and 6.
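The idea of replacing a 16-bit left shift by 5 or 6 with multiplications can be sketched in portable C. This assumes an 8x8 -> 16 multiply, as the AVR MUL instruction provides, and is only an arithmetic illustration. The actual sequence produced by avr_out_ashlhi3_mul is not shown in this log, and the function name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (uint16_t) (x << n) for n = 5 or 6 using two 8x8 multiplies
   by the constant 2^n instead of n single-bit 16-bit shifts.  */
uint16_t shl_via_mul (uint16_t x, unsigned n)
{
    assert (n == 5 || n == 6);
    uint8_t factor = (uint8_t) (1u << n);       /* 32 or 64 */
    uint8_t lo = (uint8_t) x;
    uint8_t hi = (uint8_t) (x >> 8);

    /* Low byte times 2^n: the full 16-bit product is needed.  */
    uint16_t p_lo = (uint16_t) (lo * factor);

    /* High byte times 2^n: only the low byte of the product lands
       in the 16-bit result, as its high byte.  */
    uint8_t p_hi = (uint8_t) (hi * factor);

    return (uint16_t) (p_lo + ((uint16_t) p_hi << 8));
}
```

For `x = 0x1234` and `n = 5`, the low byte gives `0x34 * 32 = 0x0680`, the high byte gives `0x12 * 32 = 0x0240` of which `0x40` is kept, and the result is `0x4680`, the same as `(uint16_t) (0x1234 << 5)`.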
2025-01-17	AVR: Add "const" attribute to avr built-in functions if possible.	Georg-Johann Lay	3	-106/+109
	gcc/ * config/avr/avr-c.cc (DEF_BUILTIN): Add ATTRS argument to macro definition. * config/avr/avr.cc: Same. (avr_init_builtins) <attr_const>: New variable that can be used as ATTRS argument in DEF_BUILTIN. * config/avr/builtins.def (DEF_BUILTIN): Add ATTRS parameter to all definitions.
2025-01-17	AVR: Use INT_N to built-in define __int24.	Georg-Johann Lay	2	-5/+8
	This patch uses the INT_N interface to define __int24 in avr-modes.def. Since the testsuite uses -Wpedantic and __int24 is a C/C++ extension, uses of __int24 and __uint24 are now marked as __extension__. PR target/118329 gcc/ * config/avr/avr-modes.def: Add INT_N (PSI, 24). * config/avr/avr.cc (avr_init_builtin_int24) <__int24>: Remove definition. <__uint24>: Adjust definition to INT_N interface. gcc/testsuite/ * gcc.target/avr/pr115830-add.c (__int24, __uint24): Add __extension__ to respective typedefs. * gcc.target/avr/pr115830-sub-ext.c: Same. * gcc.target/avr/pr115830-sub.c: Same. * gcc.target/avr/torture/get-mem.c: Same. * gcc.target/avr/torture/set-mem.c: Same. * gcc.target/avr/torture/ifelse-c.h: Same. * gcc.target/avr/torture/ifelse-d.h: Same. * gcc.target/avr/torture/ifelse-q.h: Same. * gcc.target/avr/torture/ifelse-r.h: Same. * gcc.target/avr/torture/int24-mul.c: Same. * gcc.target/avr/torture/pr109907-2.c: Same. * gcc.target/avr/torture/pr61443.c: Same. * gcc.target/avr/torture/pr63633-ice-mult.c: Same. * gcc.target/avr/torture/shift-l-u24.c: Same. * gcc.target/avr/torture/shift-r-i24.c: Same. * gcc.target/avr/torture/shift-r-u24.c: Same. * gcc.target/avr/torture/add-extend.c: Same. * gcc.target/avr/torture/sub-extend.c: Same. * gcc.target/avr/torture/sub-zerox.c: Same. * gcc.target/avr/torture/test-gprs.h: Same.
2025-01-02	Update copyright years.	Jakub Jelinek	30	-31/+31

2024-12-12	AVR: target/118000 - Fix copymem from address-spaces.	Georg-Johann Lay	1	-2/+15
	* rampz_rtx et al. were missing MEM_VOLATILE_P. This is needed because avr_emit_cpymemhi is setting RAMPZ explicitly with its own insn. * avr_out_cpymem was missing a final RAMPZ = 0 on EBI devices. This only affects the __flash1 ... __flash5 spaces since the other ASes use different routines. gcc/ PR target/118000 * config/avr/avr.cc (avr_init_expanders) <sreg_rtx> <rampd_rtx, rampx_rtx, rampy_rtx, rampz_rtx>: Set MEM_VOLATILE_P. (avr_out_cpymem) [ELPM && EBI]: Restore RAMPZ to 0 after.
2024-12-12	AVR: Assert minimal required bit width of section_common::flags.	Georg-Johann Lay	1	-0/+29
	gcc/ * config/avr/avr.cc (avr_ctz): New constexpr function. (section_common::flags): Assert minimal bit width.
2024-12-12	AVR: target/118001 - Add __flashx as 24-bit named address space.	Georg-Johann Lay	6	-114/+361
	This patch adds __flashx as a new named address space that allocates objects in .progmemx.data. The handling is mostly the same or similar to that of 24-bit space __memx, except that the asm routines are simpler and more efficient. Loads are emitted inline when ELPMX or LPMX is available. The address space uses 24-bit addresses even on devices with a program memory size of 64 KiB or less. PR target/118001 gcc/ * doc/extend.texi (AVR Named Address Spaces): Document __flashx. * config/avr/avr.h (ADDR_SPACE_FLASHX): New enum value. * config/avr/avr-protos.h (avr_out_fload, avr_mem_flashx_p) (avr_fload_libgcc_p, avr_load_libgcc_mem_p) (avr_load_libgcc_insn_p): New. * config/avr/avr.cc (avr_addrspace): Add ADDR_SPACE_FLASHX. (avr_decl_flashx_p, avr_mem_flashx_p, avr_fload_libgcc_p) (avr_load_libgcc_mem_p, avr_load_libgcc_insn_p, avr_out_fload): New functions. (avr_adjust_insn_length) [ADJUST_LEN_FLOAD]: Handle case. (avr_progmem_p) [avr_decl_flashx_p]: return 2. (avr_addr_space_legitimate_address_p) [ADDR_SPACE_FLASHX]: Has the same behavior as ADDR_SPACE_MEMX. (avr_addr_space_convert): Use pointer sizes rather than ASes. (avr_addr_space_contains): New function. (avr_convert_to_type): Use it. (avr_emit_cpymemhi): Handle ADDR_SPACE_FLASHX. * config/avr/avr.md (adjust_len) <fload>: New attr value. (gen_load<mode>_libgcc): Renamed from load<mode>_libgcc. (xload8<mode>_A): Iterate over MOVMODE rather than over ALL1. (fxmov<mode>_A): New from xloadv<mode>_A. (xmov<mode>_8): New from xload<mode>_A. (fmov<mode>): New insns. (fxload<mode>_A): New from xload<mode>_A. (fxload_<mode>_libgcc): New from xload_<mode>_libgcc. (fxload_<mode>_libgcc): New from xload_<mode>_libgcc. (mov<mode>) [avr_mem_flashx_p]: Handle ADDR_SPACE_FLASHX. (cpymemx_<mode>): Make sure the address space is not lost when splitting. (cpymemx_<mode>) [ADDR_SPACE_FLASHX]: Use __movmemf_<mode> for asm. (ashlqi.1.zextpsi_split): New combine pattern. 
* config/avr/predicates.md (nox_general_operand): Don't match when avr_mem_flashx_p is true. * config/avr/avr-passes.cc (AVR_LdSt_Props): ADDR_SPACE_FLASHX has no post_inc. gcc/testsuite/ * gcc.target/avr/torture/addr-space-1.h [AVR_HAVE_ELPM]: Use a function to bump .progmemx.data to a high address. * gcc.target/avr/torture/addr-space-2.h: Same. * gcc.target/avr/torture/addr-space-1-fx.c: New test. * gcc.target/avr/torture/addr-space-2-fx.c: New test. libgcc/ * config/avr/t-avr (LIB1ASMFUNCS): Add _fload_1, _fload_2, _fload_3, _fload_4, _movmemf. * config/avr/lib1funcs.S (.branch_plus): New .macro. (__xload_1, __xload_2, __xload_3, __xload_4): When the address is located in flash, then forward to... (__fload_1, __fload_2, __fload_3, __fload_4): ...these new functions, respectively. (__movmemx_hi): When the address is located in flash, forward to... (__movmemf_hi): ...this new function.