Age | Commit message | Author | Files | Lines
111 min. | rs6000: Remove redundant guard for float128 mode pattern (HEAD, trunk, master) | Haochen Gui | 1 | -58/+57
gcc/ * config/rs6000/rs6000.md (mov<mode>cc, *mov<mode>cc_p10, *mov<mode>cc_invert_p10, *fpmask<mode>, *xxsel<mode>, @ieee_128bit_vsx_abs<mode>2, *ieee_128bit_vsx_nabs<mode>2, add<mode>3, sub<mode>3, mul<mode>3, div<mode>3, sqrt<mode>2, copysign<mode>3, copysign<mode>3_hard, copysign<mode>3_soft, @neg<mode>2_hw, @abs<mode>2_hw, *nabs<mode>2_hw, fma<mode>4_hw, *fms<mode>4_hw, *nfma<mode>4_hw, *nfms<mode>4_hw, extend<SFDF:mode><IEEE128:mode>2_hw, trunc<mode>df2_hw, trunc<mode>sf2_hw, fix<uns>_<IEEE128:mode><SDI:mode>2_hw, fix<uns>_trunc<IEEE128:mode><QHI:mode>2, *fix<uns>_trunc<IEEE128:mode><QHSI:mode>2_mem, float_<mode>di2_hw, float_<mode>si2_hw, float<QHI:mode><IEEE128:mode>2, floatuns_<mode>di2_hw, floatuns_<mode>si2_hw, floatuns<QHI:mode><IEEE128:mode>2, floor<mode>2, ceil<mode>2, btrunc<mode>2, round<mode>2, add<mode>3_odd, sub<mode>3_odd, mul<mode>3_odd, div<mode>3_odd, sqrt<mode>2_odd, fma<mode>4_odd, *fms<mode>4_odd, *nfma<mode>4_odd, *nfms<mode>4_odd, trunc<mode>df2_odd, *cmp<mode>_hw for IEEE128): Remove guard FLOAT128_IEEE_P. (@extenddf<mode>2_fprs, @extenddf<mode>2_vsx, trunc<mode>df2_internal1, trunc<mode>df2_internal2, fix_trunc_helper<mode>, neg<mode>2, *cmp<mode>_internal1, *cmp<IBM128:mode>_internal2 for IBM128): Remove guard FLOAT128_IBM_P.
3 hours | rs6000: Change optab for ibm128 and ieee128 conversion | Kewen Lin | 1 | -6/+6
Currently, for conversions between the 128 bit floating-point formats ibm128 and ieee128, the corresponding libcalls are "__trunctfkf2" for ibm128 -> ieee128 and "__extendkftf2" for ieee128 -> ibm128, and generic code (such as convert_mode_scalar) likewise adopts sext_optab for ieee128 -> ibm128 and trunc_optab for ibm128 -> ieee128. But in the rs6000 port, as the functions rs6000_expand_float128_convert and init_float128_ieee show, we adopt sext_optab for ibm128 -> ieee128 with "__trunctfkf2" and trunc_optab for ieee128 -> ibm128 with "__extendkftf2". To make them consistent and avoid surprises, this patch adjusts the rs6000 internal handling to adopt trunc_optab for ibm128 -> ieee128 with "__trunctfkf2" and sext_optab for ieee128 -> ibm128 with "__extendkftf2". gcc/ChangeLog: * config/rs6000/rs6000.cc (init_float128_ieee): Use trunc_optab rather than sext_optab for converting FLOAT128_IBM_P mode to FLOAT128_IEEE_P mode, and use sext_optab rather than trunc_optab for converting FLOAT128_IEEE_P mode to FLOAT128_IBM_P mode. (rs6000_expand_float128_convert): Likewise.
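The pairing described above can be summarised as a small lookup. The following is only an illustrative sketch: the enums, the struct and the function fp128_conversion_for are hypothetical stand-ins, not GCC's real optab machinery; only the two libcall names and the trunc/sext pairing come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two 128-bit formats (not GCC types).  */
enum fp128_format { FMT_IBM128, FMT_IEEE128 };

/* Hypothetical stand-ins for trunc_optab and sext_optab.  */
enum conv_optab { CONV_TRUNC, CONV_SEXT };

struct fp128_conversion
{
  enum conv_optab optab;
  const char *libcall;   /* NULL when no conversion is needed.  */
};

/* Return the optab/libcall pairing for FROM -> TO as stated in the
   commit message: trunc for ibm128 -> ieee128, sext for the reverse.  */
static struct fp128_conversion
fp128_conversion_for (enum fp128_format from, enum fp128_format to)
{
  struct fp128_conversion conv = { CONV_TRUNC, NULL };

  if (from == FMT_IBM128 && to == FMT_IEEE128)
    {
      conv.optab = CONV_TRUNC;
      conv.libcall = "__trunctfkf2";
    }
  else if (from == FMT_IEEE128 && to == FMT_IBM128)
    {
      conv.optab = CONV_SEXT;
      conv.libcall = "__extendkftf2";
    }
  return conv;
}
```

The point of the sketch is that the generic code and the port must agree on one table like this; before the patch the port effectively used the opposite optab for each libcall.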
3 hours | tree: Remove KFmode workaround [PR112993] | Kewen Lin | 1 | -9/+0
The fix for PR112993 makes KFmode have 128 bit mode precision, we don't need this workaround to fix up the type precision any more, and just go with mode precision. So this patch is to remove KFmode workaround. PR target/112993 gcc/ChangeLog: * tree.cc (build_common_tree_nodes): Drop the workaround for rs6000 KFmode precision adjustment.
3 hours | ranger: Revert the workaround introduced in PR112788 [PR112993] | Kewen Lin | 1 | -8/+2
This reverts commit r14-6478-gfda8e2f8292a90 "range: Workaround different type precision between _Float128 and long double [PR112788]" as the fixes for PR112993 make all 128 bits scalar floating point have the same 128 bit precision, this workaround isn't needed any more. PR target/112993 gcc/ChangeLog: * value-range.h (range_compatible_p): Remove the workaround on different type precision between _Float128 and long double.
3 hours | fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993] | Kewen Lin | 1 | -1/+15
Previously the effective target fortran_real_c_float128 never passed on Power, regardless of whether the default 128 bit long double is ibmlongdouble or ieeelongdouble. This is because TFmode is always used for kind 16 real, which has precision 127, while the node float128_type_node for c_float128 has type precision 128, so get_real_kind_from_node can't find a match, as it only checks gfc_real_kinds[i].mode_precision against the type precision. With TFmode/IFmode/KFmode changed to have the same mode precision 128, fortran_real_c_float128 can now pass with ieeelongdouble enabled by default, and test cases guarded with it get tested accordingly. But with ibmlongdouble enabled by default, since TFmode has precision 128, which is the same as the type precision 128 of float128_type_node, get_real_kind_from_node considers the kind for TFmode to match float128_type_node; this is wrong, as at this point TFmode has the ibm extended format. So this patch teaches get_real_kind_from_node to check one more field that distinguishes the underlying real format, which avoids the unexpected match when more than one mode has the same precision. PR target/112993 gcc/fortran/ChangeLog: * trans-types.cc (get_real_kind_from_node): Consider the case where more than one mode has the same precision.
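To illustrate why precision alone is no longer a sufficient key once several modes share precision 128, here is a minimal sketch. Everything in it is hypothetical (struct fp_format, struct real_kind, example_kinds, real_kind_for); it does not reproduce the actual gfc_real_kinds table or the field the patch checks, it only assumes that some per-format marker is compared in addition to the precision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format descriptors standing in for real format objects.  */
struct fp_format { const char *name; };
static const struct fp_format ieee_double = { "ieee_double" };
static const struct fp_format ibm_extended = { "ibm_extended" };
static const struct fp_format ieee_quad = { "ieee_quad" };

/* Hypothetical, simplified entry of a real-kind table.  */
struct real_kind
{
  int kind;
  int mode_precision;
  const struct fp_format *format;
};

/* Example table for a configuration where kind 16 uses the IBM
   extended format (as with ibmlongdouble by default).  */
static const struct real_kind example_kinds[] = {
  { 8, 64, &ieee_double },
  { 16, 128, &ibm_extended },
};

/* Return the kind whose precision AND format both match, or 0 if
   there is none.  Matching on precision alone would wrongly pair an
   ieee_quad type with the ibm_extended kind 16 entry.  */
static int
real_kind_for (const struct real_kind *kinds, size_t n,
               int precision, const struct fp_format *format)
{
  for (size_t i = 0; i < n; i++)
    if (kinds[i].mode_precision == precision && kinds[i].format == format)
      return kinds[i].kind;
  return 0;
}
```

With this table, a lookup for a 128 bit ieee_quad type finds nothing, which corresponds to fortran_real_c_float128 correctly not matching under ibmlongdouble.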
3 hours | rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993] | Kewen Lin | 6 | -108/+41
On rs6000, there are three 128 bit scalar floating point modes: TFmode, IFmode and KFmode. For historical reasons, we define them with different mode precisions, that is KFmode 126, TFmode 127 and IFmode 128. But in fact all of them should have the same mode precision 128. This special setting has caused some issues, like the unexpected failures mentioned in [1], and also made us introduce some workarounds, such as the workaround in build_common_tree_nodes for KFmode 126 and the workaround in range_compatible_p for the same mode but different precision issue. This patch makes these three 128 bit scalar floating point modes TFmode, IFmode and KFmode have 128 bit mode precision, and keeps the order the same as before so that machine independent parts of the compiler do not try to widen IFmode to TFmode. Besides, build_common_tree_nodes adopts the newly added hook mode_for_floating_type, so we don't need to worry about an unexpected mode for the long double type node. In function convert_mode_scalar, with the proposed change, it adopts sext_optab for converting ieee128 format mode to ibm128 format mode and trunc_optab for converting ibm128 format mode to ieee128 format mode. Thus this patch removes the useless extend and trunc optab support, and adds new define_expands expandkftf2 and trunctfkf2 to align with the convert_mode_scalar implementation. It also unnames two define_insn_and_split patterns to avoid conflicts and make them clearer. Considering that in the current implementation there is no chance of a KF <-> IF conversion (since either of them would be TF already), it adds two dummy define_expands to assert this. [1] https://inbox.sourceware.org/gcc-patches/718677e7-614d-7977-312d-05a75e1fd5b4@linux.ibm.com/ PR target/112993 gcc/ChangeLog: * config/rs6000/rs6000-modes.def (IFmode, KFmode, TFmode): Define with FLOAT_MODE instead of FRACTIONAL_FLOAT_MODE, don't use special precisions any more. (rs6000-modes.h): Remove include.
* config/rs6000/rs6000-modes.h: Remove. * config/rs6000/rs6000.h (rs6000-modes.h): Remove include. * config/rs6000/t-rs6000: Remove rs6000-modes.h include. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace all uses of FLOAT_PRECISION_TFmode with 128. (rs6000_c_mode_for_floating_type): Likewise. * config/rs6000/rs6000.md (define_expand extendiftf2): Remove. (define_expand extendifkf2): Remove. (define_expand extendtfkf2): Remove. (define_expand trunckftf2): Remove. (define_expand trunctfif2): Remove. (define_expand extendtfif2): Add new assertion. (define_expand expandkftf2): New. (define_expand trunciftf2): Add new assertion. (define_expand trunctfkf2): New. (define_expand truncifkf2): Change with gcc_unreachable. (define_expand expandkfif2): New. (define_insn_and_split extendkftf2): Rename to ... (define_insn_and_split *extendkftf2): ... this. (define_insn_and_split trunctfkf2): Rename to ... (define_insn_and_split *extendtfkf2): ... this.
3 hours | expr: Allow same precision modes conversion between {ibm_extended, ieee_quad}_format | Kewen Lin | 2 | -10/+33
For historical reasons, rs6000 defines KFmode, TFmode and IFmode to have different mode precisions, but this causes some issues and needs some workarounds, such as PR112993. So we are going to make all rs6000 128 bit scalar FP modes have 128 bit precision. To prepare for that, this patch makes function convert_mode_scalar allow same precision FP modes conversion if their underlying formats are ibm_extended_format and ieee_quad_format respectively, just like the existing special treatment of arm_bfloat_half_format <-> ieee_half_format. It also factors out all the relevant checks into a lambda function. Besides, similar to ieee fp16 -> bfloat conversion, it adopts trunc_optab rather than sext_optab for ibm128 to ieee128 conversion. PR target/112993 gcc/ChangeLog: * expr.cc (convert_mode_scalar): Allow same precision conversion between scalar floating point modes if their underlying format is ibm_extended_format or ieee_quad_format, and refactor the assertion with the new lambda function acceptable_same_precision_modes. Use trunc_optab rather than sext_optab for ibm128 to ieee128 conversion. * optabs-libfuncs.cc (gen_trunc_conv_libfunc): Use trunc_optab rather than sext_optab for ibm128 to ieee128 conversion.
4 hours | libbacktrace: update xcoff.c for base_address changes | Ian Lance Taylor | 3 | -21/+31
* xcoff.c (struct xcoff_fileline_data): Change base_address field to struct libbacktrace_base_address. (xcoff_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (xcoff_initialize_fileline): Likewise. (xcoff_lookup_pc): Use libbacktrace_add_base. (xcoff_add): Change base_address to struct libbacktrace_base_address. (xcoff_armem_add, xcoff_add_shared_libs): Likewise. (backtrace_initialize): Likewise. * Makefile.am (xcoff.lo): Remove unused target. (xcoff_32.lo, xcoff_64.lo): New targets. * Makefile.in: Regenerate.
7 hours | rs6000: Error on CPUs and ABIs that don't support the ROP protection insns [PR114759] | Peter Bergner | 4 | -18/+39
We currently silently ignore the -mrop-protect option for old CPUs on which we don't support the ROP hash insns, but we throw an error for unsupported ABIs. This patch treats unsupported CPUs and ABIs similarly by throwing an error for both. This matches clang behavior and allows us to simplify our tests in the code that generates our prologue and epilogue code. 2024-06-26 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/114759 * config/rs6000/rs6000.cc (rs6000_option_override_internal): Disallow CPUs and ABIs that do not support the ROP protection insns. * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Remove now unneeded tests. (rs6000_emit_prologue): Likewise. Remove unneeded gcc_assert. (rs6000_emit_epilogue): Likewise. * config/rs6000/rs6000.md: Likewise. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-3.c: New test.
7 hours | rs6000: ROP - Emit hashst and hashchk insns on Power8 and later [PR114759] | Peter Bergner | 4 | -7/+24
We currently only emit the ROP-protect hash* insns for Power10, where the insns were added to the architecture. We want to emit them for earlier cpus (where they operate as NOPs), so that if those older binaries are ever executed on a Power10, then they'll be protected from ROP attacks. Binutils accepts hashst and hashchk back to Power8, so change GCC to emit them for Power8 and later. This matches clang's behavior. 2024-06-19 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/114759 * config/rs6000/rs6000-logue.cc (rs6000_stack_info): Use TARGET_POWER8. (rs6000_emit_prologue): Likewise. * config/rs6000/rs6000.md (hashchk): Likewise. (hashst): Likewise. Fix whitespace. gcc/testsuite/ PR target/114759 * gcc.target/powerpc/pr114759-2.c: New test. * lib/target-supports.exp (rop_ok): Use check_effective_target_has_arch_pwr8.
7 hours | c++/modules: Propagate BINDING_VECTOR_*_DUPS_P on realloc [PR99242] | Nathaniel Shead | 5 | -0/+20
When importing modules, when a binding vector for a name runs out of slots it gets reallocated with a larger size, and existing bindings are copied across. However, the flags to indicate whether deduping needs to occur did not: this causes ICEs, as it allows a duplicate binding to be added which then violates assumptions later on. PR c++/99242 gcc/cp/ChangeLog: * name-lookup.cc (append_imported_binding_slot): Propagate dups flags. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99242_a.H: New test. * g++.dg/modules/pr99242_b.H: New test. * g++.dg/modules/pr99242_c.H: New test. * g++.dg/modules/pr99242_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
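The class of bug described above (growing a container but forgetting to carry over its flags) can be shown with a small sketch. The struct binding_vec and its fields are hypothetical simplifications, not GCC's binding vector representation; the two flag names are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified binding vector (not GCC's data structure).  */
struct binding_vec
{
  unsigned int global_dups : 1;     /* Dedup needed for global bindings.  */
  unsigned int partition_dups : 1;  /* Dedup needed for partition bindings.  */
  size_t count;
  size_t capacity;
  int slots[];
};

/* Allocate an empty vector with room for CAPACITY slots.  */
static struct binding_vec *
binding_vec_new (size_t capacity)
{
  struct binding_vec *vec
    = calloc (1, sizeof *vec + capacity * sizeof vec->slots[0]);
  if (vec != NULL)
    vec->capacity = capacity;
  return vec;
}

/* Reallocate OLD with a larger capacity.  Returns NULL and leaves OLD
   untouched on allocation failure.  */
static struct binding_vec *
binding_vec_grow (struct binding_vec *old, size_t new_capacity)
{
  assert (new_capacity >= old->count);

  struct binding_vec *vec = binding_vec_new (new_capacity);
  if (vec == NULL)
    return NULL;

  vec->count = old->count;
  memcpy (vec->slots, old->slots, old->count * sizeof old->slots[0]);

  /* The step that is easy to forget: copy the flags as well as the
     slots, otherwise later insertions skip deduplication.  */
  vec->global_dups = old->global_dups;
  vec->partition_dups = old->partition_dups;

  free (old);
  return vec;
}
```

Omitting the two flag assignments would reproduce the shape of the problem: the data survives the reallocation, but the record that duplicates must be checked for does not.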
8 hours | Daily bump. | GCC Administrator | 8 | -1/+461
11 hours | range-ops should return the requested boolean type. | Andrew MacLeod | 1 | -25/+25
The pointer based relation operator's fold_range () routines should return a boolean range with the requested type, not the default type. PR tree-optimization/115951 * range-op-ptr.cc (operator_equal::fold_range): Return a boolean range with the requested type. (operator_not_equal::fold_range): Likewise. (operator_lt::fold_range): Likewise. (operator_le::fold_range): Likewise. (operator_gt::fold_range): Likewise. (operator_ge::fold_range): Likewise.
14 hours | c++/contracts: ICE in C++ Contracts with '-fno-exceptions' [PR 110159] | Nina Ranns | 4 | -4/+50
We currently only initialise terminate_fn if exceptions are enabled. However, contract handling requires terminate_fn when building the contract because a contract failure may result in std::terminate call regardless of whether the exceptions are enabled. Refactored init_exception_processing to extract the initialisation of terminate_fn. New function init_terminate_fn added that initialises terminate_fn if it hasn't already been initialised. Call to terminate_fn added in cxx_init_decl_processing if contracts are enabled. PR c++/110159 gcc/cp/ChangeLog: * cp-tree.h (init_terminate_fn): Declaration of a new function. * decl.cc (cxx_init_decl_processing): If contracts are enabled, call init_terminate_fn. * except.cc (init_exception_processing): Function refactored to call init_terminate_fn. (init_terminate_fn): Added new function that initializes terminate_fn if it hasn't already been initialised. gcc/testsuite/ChangeLog: * g++.dg/contracts/pr110159.C: New test. Signed-off-by: Nina Ranns <dinka.ranns@gmail.com>
15 hours | AVR: testsuite - Attribute ipa implies noinline and noclone. | Georg-Johann Lay | 43 | -123/+123
gcc/testsuite/ * gcc.target/avr/isr-test.h: Attribute ipa implies noinline and noclone. * gcc.target/avr/pr114981-powif.c: Same. * gcc.target/avr/pr114981-powil.c: Same. * gcc.target/avr/pr71676-1.c: Same. * gcc.target/avr/pr71676-2.c: Same. * gcc.target/avr/pr71676-3.c: Same. * gcc.target/avr/pr71676.c: Same. * gcc.target/avr/torture/add-extend.c: Same. * gcc.target/avr/torture/fix-types.h: Same. * gcc.target/avr/torture/fuse-add.c: Same. * gcc.target/avr/torture/get-mem.c: Same. * gcc.target/avr/torture/insv-anyshift-hi.c: Same. * gcc.target/avr/torture/insv-anyshift-si.c: Same. * gcc.target/avr/torture/isr-02-call.c: Same. * gcc.target/avr/torture/isr-03-fixed.c: Same. * gcc.target/avr/torture/pr109650-1.c: Same. * gcc.target/avr/torture/pr109650-2.c: Same. * gcc.target/avr/torture/pr109907-1.c: Same. * gcc.target/avr/torture/pr109907-2.c: Same. * gcc.target/avr/torture/pr114132-2.c: Same. * gcc.target/avr/torture/pr39633.c: Same. * gcc.target/avr/torture/pr51782-1.c: Same. * gcc.target/avr/torture/pr61055.c: Same. * gcc.target/avr/torture/pr61443.c: Same. * gcc.target/avr/torture/pr64331.c: Same. * gcc.target/avr/torture/pr77326.c: Same. * gcc.target/avr/torture/pr83729.c: Same. * gcc.target/avr/torture/pr83801.c: Same. * gcc.target/avr/torture/pr87376.c: Same. * gcc.target/avr/torture/pr88236-pr115726.c: Same. * gcc.target/avr/torture/pr92606.c: Same. * gcc.target/avr/torture/pr98762.c: Same. * gcc.target/avr/torture/sat-hr-plus-minus.c: Same. * gcc.target/avr/torture/sat-k-plus-minus.c: Same. * gcc.target/avr/torture/sat-llk-plus-minus.c: Same. * gcc.target/avr/torture/sat-r-plus-minus.c: Same. * gcc.target/avr/torture/sat-uhr-plus-minus.c: Same. * gcc.target/avr/torture/sat-uk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ullk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ur-plus-minus.c: Same. * gcc.target/avr/torture/set-mem.c: Same. * gcc.target/avr/torture/sub-extend.c: Same. * gcc.target/avr/torture/tiny-progmem.c: Same.
17 hours | c++, coroutines, contracts: Handle coroutine and void functions [PR110871,PR110872,PR115434]. | Iain Sandoe | 10 | -120/+329
The current implementation of contracts emits the checks into function bodies in three places: for pre-conditions at the start of the body, for asserts in-line in the function body, and for post-conditions as an addition to return statements. In general (at least with existing "2a" contract semantics) the in-line contract asserts behave as expected. However, the mechanism is not applicable to:
 * Handling pre conditions in coroutines since, for those, the standard specifies a wrapping of the original function body by functionality implementing initial and final suspends (along with some housekeeping to route exceptions). Thus for such transformed function bodies, the preconditions then get actioned after the initial suspend, which does not behave as intended.
 * Handling post conditions in functions that do not have return statements (which applies to coroutines and void functions).
In the following, we identify a potentially transformed function body (in the case of coroutines, this is usually called the "ramp()" function). The patch here re-implements the code insertion in one of the two following ways (code for exposition only):
 * For functions with no post-conditions we wrap the potentially transformed function as follows:
     {
       handle_pre_condition_checking ();
       potentially_transformed_function_body ();
     }
   This implements the intent that the preconditions are processed after the function parameters are initialised but before any other actions.
 * For functions with post-conditions:
     if (preconditions_exist)
       handle_pre_condition_checking ();
     try
       {
         potentially_transformed_function_body ();
       }
     finally
       {
         handle_post_condition_checking ();
       }
     else [only if the function is not marked noexcept(true) ]
       {
         ;
       }
   In this, post-conditions [that might apply to the return value etc.] are evaluated on every non-exceptional edge out of the function.
At present, the model here is that exceptions thrown by the function propagate upwards as if there were no contracts present. If the desired semantic becomes that an exception is counted as equivalent to a contract violation, then we can add a second handler in place of the empty statement. This patch specifically does not address changes to code-gen and constexpr handling that are contained in P2900. PR c++/115434 PR c++/110871 PR c++/110872 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): Handle EH_ELSE_EXPR. * contracts.cc (finish_contract_attribute): Remove excess line. (build_contract_condition_function): Post condition handlers are void now. (emit_postconditions_cleanup): Remove. (emit_postconditions): New. (add_pre_condition_fn_call): New. (add_post_condition_fn_call): New. (apply_preconditions): New. (apply_postconditions): New. (maybe_apply_function_contracts): New. (apply_postcondition_to_return): Remove. * contracts.h (apply_postcondition_to_return): Remove. (maybe_apply_function_contracts): Add. * coroutines.cc (coro_build_actor_or_destroy_function): Do not copy contracts to coroutine helpers. * decl.cc (finish_function): Handle wrapping a possibly transformed function body in contract checks. * typeck.cc (check_return_expr): Remove handling of post conditions on return expressions. gcc/ChangeLog: * gimplify.cc (struct gimplify_ctx): Add a flag to show we are expanding a handler. (gimplify_expr): When we are expanding a handler, and the body transforms might have re-written DECL_RESULT into a gimple var, ensure that handler references to DECL_RESULT are also re-written to refer to the gimple var. When we are processing an EH_ELSE expression, then add it if either of the cleanup slots is in use. gcc/testsuite/ChangeLog: * g++.dg/contracts/pr115434.C: New test. * g++.dg/coroutines/pr110871.C: New test. * g++.dg/coroutines/pr110872.C: New test. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
17 hours | AVR: testsuite - Add noipa function attribute to noclone functions. | Georg-Johann Lay | 41 | -121/+121
Many functions under test have the noinline and noclone function attributes attached so that no (constant) values are propagated into the functions, so that we actually are testing what's supposed to be tested. In order to enforce that, noipa may also be required when inter-procedural analysis / optimizations are on. gcc/testsuite/ * gcc.target/avr/isr-test.h: Add noipa function attribute to noclone functions. * gcc.target/avr/pr114981-powif.c: Same. * gcc.target/avr/pr114981-powil.c: Same. * gcc.target/avr/pr71676-1.c: Same. * gcc.target/avr/pr71676-2.c: Same. * gcc.target/avr/pr71676-3.c: Same. * gcc.target/avr/pr71676.c: Same. * gcc.target/avr/torture/fix-types.h: Same. * gcc.target/avr/torture/fuse-add.c: Same. * gcc.target/avr/torture/get-mem.c: Same. * gcc.target/avr/torture/insv-anyshift-hi.c: Same. * gcc.target/avr/torture/insv-anyshift-si.c: Same. * gcc.target/avr/torture/isr-02-call.c: Same. * gcc.target/avr/torture/isr-03-fixed.c: Same. * gcc.target/avr/torture/pr109650-1.c: Same. * gcc.target/avr/torture/pr109650-2.c: Same. * gcc.target/avr/torture/pr109907-1.c: Same. * gcc.target/avr/torture/pr109907-2.c: Same. * gcc.target/avr/torture/pr114132-2.c: Same. * gcc.target/avr/torture/pr39633.c: Same. * gcc.target/avr/torture/pr51782-1.c: Same. * gcc.target/avr/torture/pr61055.c: Same. * gcc.target/avr/torture/pr61443.c: Same. * gcc.target/avr/torture/pr64331.c: Same. * gcc.target/avr/torture/pr77326.c: Same. * gcc.target/avr/torture/pr83729.c: Same. * gcc.target/avr/torture/pr83801.c: Same. * gcc.target/avr/torture/pr87376.c: Same. * gcc.target/avr/torture/pr88236-pr115726.c: Same. * gcc.target/avr/torture/pr92606.c: Same. * gcc.target/avr/torture/pr98762.c: Same. * gcc.target/avr/torture/sat-hr-plus-minus.c: Same. * gcc.target/avr/torture/sat-k-plus-minus.c: Same. * gcc.target/avr/torture/sat-llk-plus-minus.c: Same. * gcc.target/avr/torture/sat-r-plus-minus.c: Same. * gcc.target/avr/torture/sat-uhr-plus-minus.c: Same. 
* gcc.target/avr/torture/sat-uk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ullk-plus-minus.c: Same. * gcc.target/avr/torture/sat-ur-plus-minus.c: Same. * gcc.target/avr/torture/set-mem.c: Same. * gcc.target/avr/torture/tiny-progmem.c: Same.
18 hours | Fortran: Simplify len_trim with array ref and fix mapping bug [PR84868]. | Paul Thomas | 3 | -6/+171
2024-07-16 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/84868 * simplify.cc (gfc_simplify_len_trim): If the argument is an element of a parameter array, simplify all the elements and build a new parameter array to hold the result, after checking that it doesn't already exist. * trans-expr.cc (gfc_get_interface_mapping_array): If a string length is available, use it for the typespec. (gfc_add_interface_mapping): Supply the se string length. gcc/testsuite/ PR fortran/84868 * gfortran.dg/pr84868.f90: New test.
18 hours | rtl-ssa: Fix removal of order_nodes [PR115929] | Richard Sandiford | 2 | -1/+49
order_nodes are used to implement ordered comparisons between two insns with the same program point number. remove_insn would remove an order_node from its splay tree, but didn't remove it from the insn. This caused confusion if the insn was later reinserted somewhere else that also needed an order_node. gcc/ PR rtl-optimization/115929 * rtl-ssa/insns.cc (function_info::remove_insn): Remove an order_node from the instruction as well as from the splay tree. gcc/testsuite/ PR rtl-optimization/115929 * gcc.dg/torture/pr115929-1.c: New test.
18 hours | recog: restrict paradoxical mode punning in insn_propagation [PR115901] | Richard Sandiford | 2 | -0/+22
In g:44fc801e97a8dc626a4806ff4124439003420b20 I'd extended insn_propagation to handle simple cases of hard-reg mode punning. One of the checks was that the new use mode occupied the same number of registers as the original definition mode. However, as PR115901 shows, we need to avoid increasing the size of any registers in the punned "to" expression as well. Specifically, the test includes a DImode move from GPR x0 to a vector register, followed by a V2DI use of the vector register. The simplification would then create a V2DI spanning x0 and x1, manufacturing a new, unwanted use of x1. Checking for that kind of thing directly seems too cumbersome, and is not related to the original motivation (which was to improve handling of shared vector zeros on aarch64). This patch therefore restricts the paradoxical case to constants. gcc/ PR rtl-optimization/115901 * recog.cc (insn_propagation::apply_to_rvalue_1): Restrict paradoxical mode punning to cases where "to" is constant. gcc/testsuite/ PR rtl-optimization/115901 * gcc.dg/torture/pr115901.c: New test.
18 hours | rtl-ssa: Enforce earlyclobbers on hard-coded clobbers [PR115891] | Richard Sandiford | 2 | -1/+69
The asm in the testcase has a memory operand and also clobbers ax. The clobber means that ax cannot be used to hold inputs, which extends to the address of the memory. I think I had an implicit assumption that constrain_operands would enforce this, but in hindsight, that clearly wasn't going to be true. constrain_operands only looks at constraints, and these clobbers are by definition outside the constraint system. (And that's why they have to be handled conservatively, since there's no way to distinguish the earlyclobber and non-earlyclobber cases.) The semantics of hard-coded clobbers are generic enough that I think they should be handled directly by rtl-ssa, rather than by consumers. And in the context of rtl-ssa, the easiest way to check for a clash is to walk the list of input registers, which we already have to hand. It therefore seemed better not to push this down to a more generic rtl helper. The patch detects hard-coded clobbers in the same way as regrename: by temporarily stubbing out the operands with pc_rtx. gcc/ PR rtl-optimization/115891 * rtl-ssa/changes.cc (find_clobbered_access): New function. (recog_level2): Use it to check for overlap between input registers and hard-coded clobbers. Conditionally reset recog_data.insn after changing the insn code. gcc/testsuite/ PR rtl-optimization/115891 * gcc.target/i386/pr115891.c: New test.
18 hours | AVR: Overhaul add and sub insns that extend one operand. | Georg-Johann Lay | 5 | -214/+377
These are insns of the forms (set (regA:M) (plus:M (extend:M (regB:L)) (regA:M))) and (set (regA:M) (minus:M (regA:M) (extend:M (regB:L)))) where "extend" may be a sign-extend or zero-extend, and the integer modes are SImode >= M > L >= QImode. The existing patterns are now represented in terms of insns with mode iterators and a code iterator over any_extend, and these new insn support all valid combinations of M and L (which previously was not the case). gcc/ * config/avr/avr.cc (avr_out_minus): Assimilate into... (avr_out_plus_ext): ...this new function. (avr_adjust_insn_length) [ADJUST_LEN_PLUS_EXT]: Handle case. (avr_rtx_costs_1) [PLUS, MINUS]: Adjust RTX costs. * config/avr/avr.md (adjust_len) <plus_ext>: Add new attribute value. (*addpsi3_zero_extend.hi_split): Assimilate... (*addpsi3_zero_extend.qi_split): Assimilate... (*addsi3_zero_extend_split): Assimilate... (*addsi3_zero_extend.hi_split): Assimilate... (*addpsi3_sign_extend.hi_split): Assimilate... (*addhi3.sign_extend1_split): Assimilate... (*add<PSISI:mode>3.<code>.<QIPSI:mode>_split): ...into this new insn-and-split. (*addpsi3_zero_extend.hi): Assimilate... (*addpsi3_zero_extend.qi): Assimilate... (*addsi3_zero_extend): Assimilate... (*addsi3_zero_extend.hi): Assimilate... (*addpsi3_sign_extend.hi): Assimilate... (*addhi3.sign_extend1): Assimilate... (*add<PSISI:mode>3.<code>.<QIPSI:mode>): ...into this new insn. (*subpsi3_sign_extend.hi_split): Assimilate... (*subhi3.sign_extend2_split): Assimilate... (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>_split): Assimilate... (*sub<HISI:mode>3.<code><QIPSI:mode>_split): ...into this new insn-and-split. (*subpsi3_sign_extend.hi): Assimilate... (*subhi3.sign_extend2): Assimilate... (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>): Assimilate... (*sub<HISI:mode>3.<code>.<QIPSI:mode>): ...into this new insn. (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>): Use avr_out_plus_ext for asm out. * config/avr/avr-protos.h (avr_out_minus): Remove. (avr_out_plus_ext): New proto. 
gcc/testsuite/ * gcc.target/avr/torture/add-extend.c: New test. * gcc.target/avr/torture/sub-extend.c: New test.
18 hours | PR modula2/115957 ICE on procedure local const declaration | Gaius Mulley | 6 | -48/+90
An ICE would occur if a constant was declared using a variable term. This fix catches variable terms in constant expressions and generates an unrecoverable error. gcc/m2/ChangeLog: PR modula2/115957 * gm2-compiler/M2StackAddress.mod (PopAddress): Detect tail=NIL and generate an internal error. * gm2-compiler/PCBuild.bnf (InConstParameter): New variable. (InConstBlock): New variable. (ErrorString): Rewrite using MetaErrorStringT0. (ErrorArrayAt): Rewrite using MetaErrorStringT0. (WarnMissingToken): Use MetaErrorStringT0. (CompilationUnit): Set seenError FALSE. (init): Initialize InConstParameter and InConstBlock. (ConstantDeclaration): Set InConstBlock. (ConstSetOrQualidentOrFunction): Call CheckNotVar if not InConstParameter and InConstBlock. (ConstActualParameters): Set InConstParameter TRUE and restore value at the end. * gm2-compiler/PCSymBuild.def (CheckNotVar): New procedure. Remove all unnecessary export qualified list. * gm2-compiler/PCSymBuild.mod (CheckNotVar): New procedure. gcc/testsuite/ChangeLog: PR modula2/115957 * gm2/errors/fail/badconst.mod: New test. * gm2/pim/fail/tinyadr.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
18 hours | Lower zeroing array assignment to memset for allocatable arrays. | Prathamesh Kulkarni | 2 | -10/+73
gcc/fortran/ChangeLog: * trans-expr.cc (gfc_trans_zero_assign): Handle allocatable arrays. gcc/testsuite/ChangeLog: * gfortran.dg/array_memset_3.f90: New test. Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
19 hours | tree-optimization/115841 - reduction epilogue placement issue | Richard Biener | 2 | -3/+46
When emitting the compensation to the vectorized main loop for a vector reduction value to be re-used in the vectorized epilogue we fail to place it in the correct block when the main loop is known to be entered (no loop_vinfo->main_loop_edge) but the epilogue is not (a loop_vinfo->skip_this_loop_edge). The code currently disregards this situation. With the recent znver4 cost fix I couldn't trigger this situation with the testcase but I adjusted it so it could eventually trigger on other targets. PR tree-optimization/115841 * tree-vect-loop.cc (vect_transform_cycle_phi): Correctly place the partial vector reduction for the accumulator re-use when the main loop cannot be skipped but the epilogue can. * gcc.dg/vect/pr115841.c: New testcase.
19 hours | AVR: Allow more combinations of XOR / IOR with byte-shifts. | Georg-Johann Lay | 3 | -27/+104
This patch takes some existing patterns that have QImode as one input and uses a mode iterator to allow for more modes to match. These insns are split after reload into *xorqi3 resp. *iorqi3 insn(s). gcc/ * config/avr/avr-protos.h (avr_emit_xior_with_shift): New proto. * config/avr/avr.cc (avr_emit_xior_with_shift): New function. * config/avr/avr.md (any_lshift): New code iterator. (*<xior:code><mode>.<any_lshift:code>): New insn-and-split. (<code><HISI:mode><QIPSI:mode>.0): Replaces... (*<code_stdname><mode>qi.byte0): ...this one. (*<xior:code><HISI:mode><QIPSI:mode>.<any_lshift:code>): Replaces... (*<code_stdname><mode>qi.byte1-3): ...this one.
20 hours | libiberty/buildargv: handle input consisting of only white space | Andrew Burgess | 2 | -78/+166
GDB makes use of the libiberty function buildargv for splitting the inferior (program being debugged) argument string in the case where the inferior is not being started under a shell. I have recently been working to improve this area of GDB, and noticed some unexpected behaviour in the libiberty function buildargv when the input is a string consisting only of white space.

What I observe is that if the input to buildargv is a string containing only white space, then buildargv will return an argv list containing a single empty argument, e.g.:

  char **argv = buildargv (" ");
  assert (*argv[0] == '\0');
  assert (argv[1] == NULL);

We get the same output from buildargv if the input is a single space, or multiple spaces. Other white space characters give the same results.

This doesn't seem right to me, and in fact, there appears to be a work around for this issue in expandargv where we have this code:

  /* If the file is empty or contains only whitespace, buildargv would
     return a single empty argument.  In this context we want no arguments,
     instead.  */
  if (only_whitespace (buffer))
    {
      file_argv = (char **) xmalloc (sizeof (char *));
      file_argv[0] = NULL;
    }
  else
    /* Parse the string.  */
    file_argv = buildargv (buffer);

I think that the correct behaviour in this situation is to return an empty argv array, e.g.:

  char **argv = buildargv (" ");
  assert (argv[0] == NULL);

And it turns out that this is a trivial change to buildargv. The diff does look big, but this is because I've re-indented a block. Check with 'git diff -b' to see the minimal changes. I've also removed the work around from expandargv.

When testing this sort of thing I normally write the tests first, and then fix the code. In this case test-expandargv.c has sort-of been used as a mechanism for testing the buildargv function (expandargv does call buildargv most of the time), however, for this particular issue the work around in expandargv (mentioned above) masked the buildargv bug.

I did consider adding a new test-buildargv.c file, however, this would have basically been a copy & paste of test-expandargv.c (with some minor changes to call buildargv). This would be fine now, but it feels like we would eventually end up with one file not being updated as much as the other, and so test coverage would suffer.

Instead, I have added some explicit buildargv testing to the test-expandargv.c file; this reuses the test input that is already defined for expandargv.

Of course, now that I have removed the work around from expandargv, we always call buildargv from expandargv, and so the bug I'm fixing would impact both expandargv and buildargv, so maybe the new testing is redundant? I tend to think more testing is always better, so I've left it in for now.

2024-07-16 Andrew Burgess <aburgess@redhat.com>

libiberty/
  * argv.c (buildargv): Treat input of only whitespace as an empty argument list.
  (expandargv): Remove work around for input that is only whitespace.
  * testsuite/test-expandargv.c: Add new tests 10, 11, and 12. Extend testing to call buildargv in more cases.
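The behaviour argued for above (white-space-only input yields an empty, NULL-terminated vector) can be illustrated with a small stand-alone sketch. This is not the libiberty code: split_on_whitespace is a hypothetical helper that splits on white space only, ignores quoting and backslashes, and simply aborts on allocation failure.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the libiberty implementation: split INPUT on
   white space only and return a NULL-terminated, heap-allocated vector.
   Input that contains nothing but white space gives a vector whose first
   element is NULL, i.e. an empty argument list.  */
char **
split_on_whitespace (const char *input)
{
  size_t count = 0, capacity = 4;
  char **argv = malloc (capacity * sizeof (char *));

  assert (input != NULL);
  if (argv == NULL)
    abort ();

  for (const char *p = input;;)
    {
      /* Skip white space first, so trailing or lone white space never
         starts a new (empty) argument.  */
      while (*p != '\0' && isspace ((unsigned char) *p))
        p++;
      if (*p == '\0')
        break;

      const char *start = p;
      while (*p != '\0' && !isspace ((unsigned char) *p))
        p++;

      /* Always keep one slot free for the terminating NULL.  */
      if (count + 1 >= capacity)
        {
          capacity *= 2;
          argv = realloc (argv, capacity * sizeof (char *));
          if (argv == NULL)
            abort ();
        }

      size_t len = (size_t) (p - start);
      char *arg = malloc (len + 1);
      if (arg == NULL)
        abort ();
      memcpy (arg, start, len);
      arg[len] = '\0';
      argv[count++] = arg;
    }

  argv[count] = NULL;
  return argv;
}
```

Calling split_on_whitespace (" ") in this sketch returns a vector with argv[0] == NULL, matching the second assert example in the commit message rather than the first.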
20 hourslibiberty/buildargv: POSIX behaviour for backslash handlingAndrew Burgess2-2/+40
GDB makes use of the libiberty function buildargv for splitting the inferior (program being debugged) argument string in the case where the inferior is not being started under a shell. I have recently been working to improve this area of GDB, and have tracked down some of the unexpected behaviour to the libiberty function buildargv, and how it handles backslash escapes.

For reference, I've been mostly reading: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

The issues that I would like to fix are:

1. Backslashes within single quotes should not be treated as an escape, thus: '\a' should split to \a, retaining the backslash.

2. Backslashes within double quotes should only act as an escape if they are immediately before one of the characters $ (dollar), ` (backtick), " (double quote), \ (backslash), or \n (newline). In all other cases a backslash should not be treated as an escape character. Thus: "\a" should split to \a, but "\$" should split to $.

3. A backslash-newline sequence should be treated as a line continuation, both the backslash and the newline should be removed.

I've updated libiberty and also added some tests. All the existing libiberty tests continue to pass, but I'm not sure if there is more testing that should be done. buildargv is used within lto-wrapper.cc, so maybe there's some testing folk can suggest that I run?

2024-07-16 Andrew Burgess <aburgess@redhat.com>

libiberty/
  * argv.c (buildargv): Backslashes within single quotes are literal, backslashes only escape POSIX defined special characters within double quotes, and backslashed newlines should act as line continuations.
  * testsuite/test-expandargv.c: Add new tests 7, 8, and 9.
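As a rough illustration of the three rules listed above, here is a minimal quote-removal sketch for a single argument. It is not the argv.c implementation: posix_unquote is a hypothetical name, white-space splitting is left out, and two choices are assumptions of this sketch: an unquoted backslash escapes the next character, and a trailing backslash is kept literally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not libiberty code: copy IN to OUT with shell-style
   quoting removed.  OUT must have room for strlen (IN) + 1 bytes.
   Returns the number of characters written, excluding the terminator.  */
size_t
posix_unquote (const char *in, char *out)
{
  enum { NONE, SINGLE, DOUBLE } quote = NONE;
  size_t n = 0;

  assert (in != NULL && out != NULL);
  for (; *in != '\0'; in++)
    {
      char c = *in;

      if (quote == SINGLE)
        {
          /* Rule 1: inside single quotes everything, including a
             backslash, is literal until the closing quote.  */
          if (c == '\'')
            quote = NONE;
          else
            out[n++] = c;
        }
      else if (c == '\\' && in[1] == '\n')
        in++;                   /* Rule 3: drop backslash and newline.  */
      else if (c == '\\' && quote == DOUBLE)
        {
          /* Rule 2: inside double quotes only $ ` " and \ are escaped
             (backslash-newline was handled above).  */
          char next = in[1];
          if (next == '$' || next == '`' || next == '"' || next == '\\')
            {
              out[n++] = next;
              in++;
            }
          else
            out[n++] = c;       /* Keep the backslash itself.  */
        }
      else if (c == '\\')
        {
          /* Assumption: an unquoted backslash escapes the next character;
             a trailing backslash is kept as-is.  */
          if (in[1] != '\0')
            out[n++] = *++in;
          else
            out[n++] = c;
        }
      else if (c == '"')
        quote = (quote == DOUBLE) ? NONE : DOUBLE;
      else if (c == '\'' && quote == NONE)
        quote = SINGLE;
      else
        out[n++] = c;
    }

  out[n] = '\0';
  return n;
}
```

With this sketch, the inputs '\a' and "\a" both produce the two characters \a, while "\$" produces the single character $.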
23 hoursi386, testsuite: Fix non-Unicode characterPaul-Antoine Arras1-1/+1
gcc/testsuite/ChangeLog: * gcc.target/i386/indirect-thunk-extern-1.c: Replace character with invalid encoding with `?`.
23 hourss390: Fix unresolved iterators bhfgq and xdeeStefan Schulze Frielinghaus3-7/+2
Code attribute bhfgq is missing a mapping for TF. This results in unresolved iterators in assembler templates for *bswaptf. With the TF mapping added the base mnemonics vlbr and vstbr are not "used" anymore but only the extended mnemonics (vlbr<bhfgq> was interpreted as vlbr; likewise for vstbr). Therefore, remove the base mnemonics from the scheduling description, otherwise, genattrtab would error about unknown mnemonics. Similarly, we end up with unresolved iterators in assembler templates for mulfprx23 since code attribute xdee is missing a mapping for FPRX2. gcc/ChangeLog: * config/s390/3931.md (vlbr, vstbr): Remove. * config/s390/s390.md (xdee): Add FPRX2 mapping. * config/s390/vector.md (bhfgq): Add TF mapping.
24 hoursFixup unaligned load/store cost for znver5Richard Biener1-2/+2
Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply copied from the bogus znver4 costs. The following makes the unaligned costs equal to the aligned costs like in the fixed znver4 version. * config/i386/x86-tune-costs.h (znver5_cost): Update unaligned load and store cost from the aligned costs.
24 hourss390: Drop vcond{,u} expandersStefan Schulze Frielinghaus1-35/+0
Optabs vcond{,u} will be removed for GCC 15. Since regtest shows no fallout, dropping the expanders, now. gcc/ChangeLog: PR target/114189 * config/s390/vector.md (V_HW2): Remove. (vcond<V_HW:mode><V_HW2:mode>): Remove. (vcondu<V_HW:mode><V_HW2:mode>): Remove.
24 hourss390: Enable vcond_mask for 128-bit opsStefan Schulze Frielinghaus1-4/+4
In preparation for dropping the vcond{,u,eq} optabs https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654690.html enable 128-bit operands for vcond_mask, covering integer as well as floating point. This partially fixes PR115519 w.r.t. the autovec-long-double-signaling-*.c tests. gcc/ChangeLog: * config/s390/vector.md: Enable vcond_mask for 128-bit ops.
24 hourss390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integersStefan Schulze Frielinghaus4-12/+171
Mode iterator V_HW enables V1TI for target VXE which means vec_cmpv1tiv1ti becomes available which leads to an ICE since there is no corresponding insn. Fixed by emulating comparisons and enabling mode V1TI unconditionally for V_HW. For the sake of symmetry, I also added TI mode to V_HW since TF mode is already included. As a consequence the consumers of V_HW vec_{splat,slb,sld,sldw,sldb,srdb,srab,srb,test_mask_int,test_mask} also become available for 128-bit integers. This fixes gcc.c-torture/execute/pr105613.c and gcc.dg/pr106063.c. gcc/ChangeLog: * config/s390/vector.md (V_HW): Enable V1TI unconditionally and add TI. (vec_cmpu<VIT_HW:mode><VIT_HW:mode>): Add 128-bit integer variants. (*vec_cmpeq<mode><mode>_nocc_emu): Emulate operation. (*vec_cmpgt<mode><mode>_nocc_emu): Emulate operation. (*vec_cmpgtu<mode><mode>_nocc_emu): Emulate operation. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-cmp-emu-1.c: New test. * gcc.target/s390/vector/vec-cmp-emu-2.c: New test. * gcc.target/s390/vector/vec-cmp-emu-3.c: New test.
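For readers unfamiliar with what "emulating" a 128-bit comparison means, the general idea can be sketched in C by comparing two 64-bit halves. This is only the textbook decomposition, not the instruction sequence the patch emits in vector.md; struct u128 and the cmp_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves.  */
struct u128
{
  uint64_t hi;
  uint64_t lo;
};

/* Equality: both halves must match.  */
bool
cmp_eq128 (struct u128 a, struct u128 b)
{
  return a.hi == b.hi && a.lo == b.lo;
}

/* Unsigned greater-than: the high halves decide unless they are equal,
   in which case the low halves decide.  */
bool
cmp_gtu128 (struct u128 a, struct u128 b)
{
  return a.hi > b.hi || (a.hi == b.hi && a.lo > b.lo);
}

/* Signed greater-than: only the high half carries the sign, so it is
   compared as signed; the low half is still compared as unsigned.  The
   uint64_t to int64_t conversion assumes two's complement, which holds
   for GCC targets.  */
bool
cmp_gt128 (struct u128 a, struct u128 b)
{
  return (int64_t) a.hi > (int64_t) b.hi
         || (a.hi == b.hi && a.lo > b.lo);
}
```

The value with all bits set is the largest unsigned number but equals -1 when read as signed, which is why the signed and unsigned variants need separate handling of the high half.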
24 hourstree-optimization/115843 - fix wrong-code with fully-masked loop and peelingRichard Biener2-2/+47
When AVX512 uses a fully masked loop and peeling we fail to create the correct initial loop mask when the mask is composed of multiple components in some cases. The following fixes this by properly applying the bias for the component to the shift amount. PR tree-optimization/115843 * tree-vect-loop-manip.cc (vect_set_loop_condition_partial_vectors_avx512): Properly bias the shift of the initial mask for alignment peeling. * gcc.dg/vect/pr115843.c: New testcase.
25 hoursFixup unaligned load/store cost for znver4Richard Biener1-2/+2
Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply left untouched from znver3 where they equate the aligned costs when tweaking aligned costs for znver4. The following makes the unaligned costs equal to the aligned costs. This avoids the miscompile seen in PR115843 but it's of course not a real fix for the issue uncovered there. But it makes it qualify as a regression fix. PR tree-optimization/115843 * config/i386/x86-tune-costs.h (znver4_cost): Update unaligned load and store cost from the aligned costs.
26 hoursPR tree-optimization/114661: Generalize MULT_EXPR recognition in match.pd.Roger Sayle2-17/+36
This patch resolves PR tree-optimization/114661, by generalizing the set of expressions that we canonicalize to multiplication. This extends the optimization(s) contributed (by me) back in July 2021. https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575999.html

The existing transformation folds (X*C1)^(X<<C2) into X*C3 when allowed. A subtlety is that for non-wrapping integer types, we actually fold this into (int)((unsigned)X*C3) so that we don't introduce an undefined overflow that wasn't in the original. Unfortunately, this transformation confuses itself, as the type-cast multiplication isn't recognized when further combining bit operations. Fixed here by allowing optional useless type conversions in transforms to turn (int)((unsigned)X*C1)^(X<<C2) into (int)((unsigned)X*C3) so that match.pd and EVRP can continue to construct multiplications.

For the example given in the PR:

  unsigned mul(unsigned char c) {
    if (c > 3) __builtin_unreachable();
    return c << 18 | c << 15 | c << 12 | c << 9 | c << 6 | c << 3 | c;
  }

GCC on x86_64 with -O2 previously generated:

  mul:
    movzbl %dil, %edi
    leal (%rdi,%rdi,8), %edx
    leal 0(,%rdx,8), %eax
    movl %edx, %ecx
    sall $15, %edx
    orl %edi, %eax
    sall $9, %ecx
    orl %ecx, %eax
    orl %edx, %eax
    ret

with this patch we now generate:

  mul:
    movzbl %dil, %eax
    imull $299593, %eax, %eax
    ret

2024-07-16 Roger Sayle <roger@nextmovesoftware.com>
           Richard Biener <rguenther@suse.de>

gcc/ChangeLog
  PR tree-optimization/114661
  * match.pd ((X*C1)|(X*C2) to X*(C1+C2)): Allow optional useless type conversions around multiplications, such as those inserted by this transformation.

gcc/testsuite/ChangeLog
  PR tree-optimization/114661
  * gcc.dg/pr114661.c: New test case.
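The arithmetic behind the constant can be checked directly: 299593 is 1 + 2^3 + 2^6 + 2^9 + 2^12 + 2^15 + 2^18, and for c <= 3 each shifted copy of c occupies its own 3-bit slot, so OR-ing the copies equals adding them. The sketch below uses hypothetical function names and an assert in place of __builtin_unreachable; the last function shows the shape of the (int)((unsigned)X*C3) form mentioned above.

```c
#include <assert.h>

/* The shift-and-OR form from the PR.  Precondition: c <= 3, so the
   shifted copies of c never overlap.  */
unsigned
mul_by_shifts (unsigned char c)
{
  assert (c <= 3);
  return c << 18 | c << 15 | c << 12 | c << 9 | c << 6 | c << 3 | c;
}

/* The equivalent multiplication: 299593 has a 1 bit at positions
   0, 3, 6, 9, 12, 15 and 18.  */
unsigned
mul_by_constant (unsigned char c)
{
  assert (c <= 3);
  return c * 299593u;
}

/* Shape of the fold for non-wrapping types: multiply in unsigned
   arithmetic, where overflow wraps instead of being undefined, then
   convert back.  The conversion of an out-of-range value back to int is
   implementation-defined in ISO C; GCC defines it as wrapping.  */
int
mul_via_unsigned (int x, int c3)
{
  return (int) ((unsigned) x * (unsigned) c3);
}
```

Since both forms agree for every permitted input, the compiler is free to pick the single imull once it can see through the type conversions.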
31 hoursi386: extend trunc{128}2{16,32,64}'s scope.Hu, Lin12-5/+47
Based on actual usage, trunc{128}2{16,32,64} use some instructions from sse/sse3, so relax their enabling conditions to widen the scope of the optimization. gcc/ChangeLog: PR target/107432 * config/i386/sse.md (PMOV_SRC_MODE_3_AVX2): Add TARGET_AVX2 for V4DI and V8SI. (PMOV_SRC_MODE_4): Add TARGET_AVX2 for V4DI. (trunc<mode><pmov_dst_3_lower>2): Change constraint from TARGET_AVX2 to TARGET_SSSE3. (trunc<mode><pmov_dst_4_lower>2): Ditto. (truncv2div2si2): Change constraint from TARGET_AVX2 to TARGET_SSE. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-10.c: New test.
32 hourslibbacktrace: support FDPICIan Lance Taylor5-62/+123
Based on patch by Max Filippov. * internal.h: If FDPIC, #include <link.h> and/or <sys/link.h>. (libbacktrace_using_fdpic): Define. (struct libbacktrace_base_address): Define. (libbacktrace_add_base): Define. (backtrace_dwarf_add): Change base_address to struct libbacktrace_base_address. * dwarf.c (struct dwarf_data): Change base_address to struct libbacktrace_base_address. (add_ranges, find_address_ranges, build_address_map): Likewise. (build_dwarf_data, build_dwarf_add): Likewise. (add_low_high_range): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (add_ranges_from_ranges, add_ranges_from_rnglists): Likewise. (add_line): Use libbacktrace_add_base. * elf.c (elf_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (elf_add): Change base_address to struct libbacktrace_base_address. (phdr_callback): Likewise. Initialize base_address.m. (backtrace_initialize): If using FDPIC, don't call elf_add with main executable; always use dl_iterate_phdr. * macho.c (macho_add_symtab): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (macho_syminfo): Change base_address to struct libbacktrace_base_address. (macho_add_fat, macho_add_dsym, macho_add): Likewise. (backtrace_initialize): Likewise. Initialize base_address.m. * pecoff.c (coff_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (coff_add): Change base_address to struct libbacktrace_base_address. Initialize base_address.m.
32 hoursDaily bump.GCC Administrator4-1/+356
32 hoursFix liveness computation for shift/rotate counts in ext-dceJeff Law1-3/+4
So as I've noted before I believe the control flow in ext-dce.cc is horribly messy. While investigating a fix for 115877 I came across another problem related to control flow handling. Specifically, if we have a binary op which implies the 2nd operand is fully live, then we'd actually fail to mark that operand as live. We essentially broke out of the loop which was supposed to be safe. But Y was a REG and if Y is a REG or CONST_INT we skip sub-rtxs and thus failed to process that operand (the shift count) at all. Rather than muck around with control flow, we can just set all the bits as live in DST_MASK and let normal processing continue. With all the bits live in DST_MASK all the bits implied by the mode of the argument will also be live. No testcase. Bootstrapped and regression tested on x86. Pushing to the trunk. gcc/ * ext-dce.cc (ext_dce_process_uses): Simplify control flow and fix liveness computation for shift/rotate counts.
34 hoursFix sign/carry bit handling in ext-dce.Jeff Law2-2/+92
My change to fix a ubsan issue broke handling propagation of the carry/sign bit down through a right shift. Thanks to Andreas for the analysis and proposed fix and Sergei for the testcase. PR rtl-optimization/115876 PR rtl-optimization/115916 gcc/ * ext-dce.cc (carry_backpropagate): Make return type unsigned as well. Cast to signed for right shift to preserve sign bit. gcc/testsuite/ * g++.dg/torture/pr115916.C: New test. Co-author: Andreas Schwab <schwab@linux-m68k.org> Co-author: Sergei Trofimovich <slyfox at gentoo dot org>
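The difference the ChangeLog entry refers to (an unsigned value that must be cast to signed before a right shift so the sign bit is preserved) can be shown in isolation. This is a generic sketch, not the carry_backpropagate code; the function names are hypothetical, and right-shifting a negative signed value is implementation-defined in ISO C, though GCC defines it as an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Shifting an unsigned mask right fills the vacated top bits with
   zeros, so the sign bit is lost.  */
uint64_t
shift_right_logical (uint64_t mask, unsigned n)
{
  assert (n < 64);
  return mask >> n;
}

/* Casting to signed first makes the shift arithmetic (on GCC and other
   two's complement implementations), replicating the sign bit into the
   vacated positions; the result is then converted back to unsigned.  */
uint64_t
shift_right_arithmetic (uint64_t mask, unsigned n)
{
  assert (n < 64);
  return (uint64_t) ((int64_t) mask >> n);
}
```

For a mask with only the top bit set and a shift of 4, the logical version leaves a single bit at position 59, while the arithmetic version leaves bits 59 through 63 set.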
35 hoursc++: alias template with dependent attributes [PR115897]Patrick Palka2-0/+42
Here we're prematurely stripping the dependent alias template-id A<T> to its defining-type-id T when used as a template argument, which in turn causes us to essentially ignore A's vector_size attribute in the outer template-id. This has always been a problem for class template-ids it seems, and after r14-2170 variable template-ids are affected as well. This patch marks alias templates that have a dependent attribute as complex (as with e.g. constrained alias templates) so that we don't look through them prematurely. PR c++/115897 gcc/cp/ChangeLog: * pt.cc (complex_alias_template_p): Return true for an alias template with attributes. (get_underlying_template): Don't look through an alias template with attributes. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/alias-decl-77.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
35 hoursRevert "RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr"Christoph Müllner1-3/+6
This reverts commit 5040c273484d7123a40a99cdeb434cecbd17a2e9.
40 hoursRISC-V: Allow adding enabled extension via target arch attributesChristoph Müllner5-8/+47
The set of enabled extensions can be extended via target arch function attributes by listing each extension with a '+' prefix and a comma as list separator. E.g.: __attribute__((target("arch=+zba,+zbb"))) void foo(); The programmer intends to ensure that one or more extensions are enabled when building the code. This is independent of the arch string that is passed at build time via the -march= option. Therefore, it is reasonable to allow enabling extensions via target arch attributes, which have already been enabled via the -march= string. The subset list code already supports such duplication for implied extensions. This patch adds an interface so the subset list parser can be switched into a mode where duplication is allowed. This commit fixes the following regressed test cases: * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_subset_list::add): Allow adding enabled extension if m_allow_adding_dup is set. * config/riscv/riscv-subset.h: Add m_allow_adding_dup and setter. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Allow adding enabled extensions. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr115554.c: Change expected fail to expected pass. * gcc.target/riscv/target-attr-16.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
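The "mode where duplication is allowed" can be pictured with a toy model. This is not riscv_subset_list::add: struct ext_list, ext_list_add and the fixed-size array are hypothetical, and only the allow_dup flag mirrors the m_allow_adding_dup idea from the ChangeLog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXT_LIST_MAX 16

/* Toy model of a list of enabled extension names.  */
struct ext_list
{
  const char *names[EXT_LIST_MAX];
  size_t count;
  bool allow_dup;   /* When true, re-adding an enabled extension is OK.  */
};

/* Add NAME to LIST.  Returns 0 on success and -1 if the extension is
   already present while duplicates are rejected, or if the list is
   full.  In allow_dup mode a repeated name is accepted as a no-op.  */
int
ext_list_add (struct ext_list *list, const char *name)
{
  assert (list != NULL && name != NULL);

  for (size_t i = 0; i < list->count; i++)
    if (strcmp (list->names[i], name) == 0)
      return list->allow_dup ? 0 : -1;

  if (list->count == EXT_LIST_MAX)
    return -1;

  list->names[list->count++] = name;
  return 0;
}
```

In this model, a list seeded from "-march=rv64gc_zbb" would reject a later "+zbb" with allow_dup unset and accept it, without growing, once allow_dup is set.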
40 hoursRISC-V: Rewrite target attribute handlingChristoph Müllner23-203/+371
The target-arch attribute handling in RISC-V is only a few months old, but has already seen a rewrite (9941f0295a14), which addressed an important issue. This rewrite introduced a hash table in the backend, which is used to keep track of target-arch attributes of all functions. The index of this hash table is the pointer to the function declaration object (fndecl). However, objects like these don't have the lifetime that is assumed here, which resulted in the same address being observed for two different fndecl objects (triggering the assertion in riscv_func_target_put() -- see also PR115562). This patch removes the hash table approach in favor of storing target specific options using the DECL_FUNCTION_SPECIFIC_TARGET() macro, which is also used by other backends and is specifically designed for this purpose (https://gcc.gnu.org/onlinedocs/gccint/Function-Properties.html). To have an accessible field in the target options, we need to adjust riscv.opt and introduce the field riscv_arch_string (for the already existing option '-march='). Using this macro allows removing much code from riscv-common.cc, which controls access to the objects 'func_target_table' and 'current_subset_list'. One thing to mention is that we had two subset lists: current_subset_list and cmdline_subset_list, with the latter being introduced recently for target attribute handling. This patch reduces them back to one (cmdline_subset_list) which contains the list of extensions that have been enabled by the command line arguments. Note that the patch keeps the existing behavior of rejecting duplications of extensions when added via the '+' operator in a function target attribute. E.g. "-march=rv64gc_zbb" and "arch=+zbb" will trigger an error (see pr115554.c). 
However, at the same time this patch breaks the acceptance of adding implied extensions, which causes the following six regressions (with the error "extension 'EXT' appear more than one time"): * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c New tests were added to document the behavior and to ensure it won't regress. This patch did not show any regressions for rv32/rv64 and fixes the ICEs from PR115554 and PR115562. PR target/115554 PR target/115562 gcc/ChangeLog: * common/config/riscv/riscv-common.cc (struct riscv_func_target_info): Remove. (struct riscv_func_target_hasher): Likewise. (riscv_func_decl_hash): Likewise. (riscv_func_target_hasher::hash): Likewise. (riscv_func_target_hasher::equal): Likewise. (riscv_current_subset_list): Likewise. (riscv_cmdline_subset_list): Remove obsolete space. (riscv_func_target_table_lazy_init): Remove. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. (riscv_arch_str): Generate from cmdline_subset_list. (riscv_set_arch_by_subset_list): Don't set current_subset_list. (riscv_parse_arch_string): Remove current_subset_list. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Get subset list via riscv_cmdline_subset_list(). * config/riscv/riscv-subset.h (riscv_current_subset_list): Remove prototype. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Build base arch string from existing target options, if any. 
(riscv_target_attr_parser::update_settings): Store new arch string in target options. (riscv_process_one_target_attr): Whitespace fix. (riscv_process_target_attr): Drop opts argument. (riscv_option_valid_attribute_p): Properly save, change and restore target options. * config/riscv/riscv.cc (get_arch_str): New function. (riscv_declare_function_name): Get arch string for option-arch directive from function's target options. * config/riscv/riscv.opt: Add riscv_arch_string variable to march option. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: Add test for option-arch directive. * gcc.target/riscv/target-attr-02.c: Likewise. * gcc.target/riscv/target-attr-03.c: Likewise. * gcc.target/riscv/target-attr-04.c: Likewise. * gcc.target/riscv/target-attr-05.c: Fix formatting. * gcc.target/riscv/target-attr-06.c: Likewise. * gcc.target/riscv/target-attr-07.c: Likewise. * gcc.target/riscv/pr115554.c: New test. * gcc.target/riscv/pr115562.c: New test. * gcc.target/riscv/target-attr-08.c: New test. * gcc.target/riscv/target-attr-09.c: New test. * gcc.target/riscv/target-attr-10.c: New test. * gcc.target/riscv/target-attr-11.c: New test. * gcc.target/riscv/target-attr-12.c: New test. * gcc.target/riscv/target-attr-13.c: New test. * gcc.target/riscv/target-attr-14.c: New test. * gcc.target/riscv/target-attr-15.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
40 hoursRISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptrChristoph Müllner1-6/+3
Allocating an object on the heap with new, wrapping it in a std::unique_ptr and finally getting the buffer via buf.get() is a correct way to allocate a buffer that is automatically freed on return. However, a simple invocation of alloca() does the same with less overhead. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Replace new + std::unique_ptr by alloca(). (riscv_process_one_target_attr): Likewise. (riscv_process_target_attr): Likewise. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
40 hours[i386] adjust flag_omit_frame_pointer in a single function [PR113719]Alexandre Oliva1-6/+6
The first two patches for PR113719 have each regressed gcc.dg/ipa/iinline-attr.c on a different target. The reason for this instability is that there are competing flag_omit_frame_pointer overriders on x86: - ix86_recompute_optlev_based_flags computes and sets a -f[no-]omit-frame-pointer default depending on USE_IX86_FRAME_POINTER and, in 32-bit mode, optimize_size - ix86_option_override_internal enables flag_omit_frame_pointer for -momit-leaf-frame-pointer to take effect ix86_option_override[_internal] calls ix86_recompute_optlev_based_flags before setting flag_omit_frame_pointer. It is called during global process_options. But ix86_recompute_optlev_based_flags is also called by parse_optimize_options, during attribute processing, and at that point, ix86_option_override is not called, so the final overrider for global options is not applied to the optimize attributes. If they differ, the testcase fails. In order to fix this, we need to process all overriders of this option whenever we process any of them. Since this setting is affected by optimization options, it makes sense to compute it in parse_optimize_options, rather than in process_options. for gcc/ChangeLog PR target/113719 * config/i386/i386-options.cc (ix86_option_override_internal): Move flag_omit_frame_pointer final overrider... (ix86_recompute_optlev_based_flags): ... here.
40 hoursRISC-V: Fix testcase for vector .SAT_SUB in zip benchmarkEdwin Lu1-0/+1
The following testcase was not properly testing anything due to an uninitialized variable. As a result, the loop was not iterating through the testing data, but instead on undefined values which could cause an unexpected abort. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: initialize variable Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
2 daysAVR: avr-md - Simplify GET_MODE and GET_MODE_BITSIZE.Georg-Johann Lay2-22/+22
gcc/ * config/avr/avr.md: Simplify mode usage. (GET_MODE_SIZE (<MODE>mode)): Use <SIZE> instead. (GET_MODE_BITSIZE (<MODE>mode) - 1): Use <MSB> instead. (GET_MODE_MASK (QImode)): Use 0xff instead. * config/avr/avr-fixed.md: Same.