path: root/gcc
Age | Commit message | Author | Files | Lines
2022-11-18 | Daily bump. | GCC Administrator | 3 | -1/+18
2022-11-18 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 13 | -5/+260
Merge up to r12-8916-g14faa5f585f6025df1e04c8c8b34340ff5e4d494 (18th Nov 2022)
2022-11-18 | gcn: Add __builtin_gcn_kernarg_ptr | Tobias Burnus | 3 | -4/+36
Add __builtin_gcn_kernarg_ptr to avoid using hard-coded register values and permit future ABI changes while keeping the API. gcc/ChangeLog: * config/gcn/gcn-builtins.def (KERNARG_PTR): Add. * config/gcn/gcn.cc (gcn_init_builtin_types): Change siptr_type_node, sfptr_type_node and voidptr_type_node from FLAT to ADDR_SPACE_DEFAULT. (gcn_expand_builtin_1): Handle GCN_BUILTIN_KERNARG_PTR. (gcn_oacc_dim_size): Return in ADDR_SPACE_FLAT. libgomp/ChangeLog: * config/gcn/team.c (gomp_gcn_enter_kernel): Use __builtin_gcn_kernarg_ptr instead of asm ("s8"). Co-Authored-By: Andrew Stubbs <ams@codesourcery.com> (cherry picked from commit 6f83861cc1c4d09425aa6539877bfa50ef90f183)
2022-11-17 | c++: constinit on pointer to function [PR104066] | Marek Polacek | 2 | -1/+13
[dcl.constinit]: "The constinit specifier shall be applied only to a declaration of a variable with static or thread storage duration." Thus, this ought to be OK: constinit void (*p)() = nullptr; but the error message I introduced when implementing constinit was not looking at funcdecl_p, so the code above was rejected. Fixed thus. I'm checking constinit_p first because I think that's far more likely to be false than funcdecl_p. PR c++/104066 gcc/cp/ChangeLog: * decl.cc (grokdeclarator): Check funcdecl_p before complaining about constinit. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constinit18.C: New test. (cherry picked from commit 7b3b2f50953c5143d4b14b59d322d8a793f411dd)
2022-11-17 | Daily bump. | GCC Administrator | 3 | -1/+35
2022-11-16 | aarch64: Add support for Ampere-1A (-mcpu=ampere1a) CPU | Philipp Tomsich | 6 | -3/+178
This patch adds support for Ampere-1A CPU: - recognize the name of the core and provide detection for -mcpu=native, - updated extra_costs, - adds a new fusion pair for (A+B+1 and A-B-1). Ampere-1A and Ampere-1 have more timing difference than the extra costs indicate, but these don't propagate through to the headline items in our extra costs (e.g. the change in latency for scalar sqrt doesn't have a corresponding table entry). gcc/ChangeLog: * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add ampere1a. * config/aarch64/aarch64-cost-tables.h: Add ampere1a_extra_costs. * config/aarch64/aarch64-fusion-pairs.def (AARCH64_FUSION_PAIR): Define a new fusion pair for A+B+1/A-B-1 (i.e., add/subtract two registers and then +1/-1). * config/aarch64/aarch64-tune.md: Regenerate. * config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Implement idiom-matcher for the new fusion pair. * doc/invoke.texi: Add ampere1a. (cherry picked from commit 590a06afbf0e96813b5879742f38f3665512c854)
2022-11-16 | SRA: Limit replacement creation for accesses propagated from LHSs | Martin Jambor | 2 | -0/+34
PR 107206 is fallout from the fix to PR 92706 where we started propagating accesses across assignments also from LHS to RHS of assignments so that we would not do harmful total scalarization of the aggregates on the RHS. But this can lead to new scalarization of these aggregates and in the testcase of PR 107206 these can appear in superfluous uses of un-initialized values and spurious warnings. Fixed by making sure the accesses created by propagation in this direction are only used as a basis for replacements when the structure would be totally scalarized anyway. gcc/ChangeLog: 2022-10-18 Martin Jambor <mjambor@suse.cz> PR tree-optimization/107206 * tree-sra.cc (struct access): New field grp_result_of_prop_from_lhs. (analyze_access_subtree): Do not create replacements for accesses with this flag when not totally scalarizing. (propagate_subaccesses_from_lhs): Set the new flag. gcc/testsuite/ChangeLog: 2022-10-18 Martin Jambor <mjambor@suse.cz> PR tree-optimization/107206 * g++.dg/tree-ssa/pr107206.C: New test. (cherry picked from commit f6c168f8c06047bfaa3005e570126831b8855dcc)
2022-11-16 | nvptx/mkoffload.cc: Fix "$nohost" check | Tobias Burnus | 2 | -2/+12
If lhd_set_decl_assembler_name is invoked - in particular if !TREE_PUBLIC (decl) && !DECL_FILE_SCOPE_P (decl) - the '.nohost' suffix might change to '.nohost.2'. This happens for the existing reverse offload testcases via cgraph_node::analyze and is a side effect of r13-3455-g178ac530fe67e4f2fc439cc4ce89bc19d571ca31 for some reason. The solution is to not only check for a trailing '$nohost' but also for '$nohost$' in nvptx/mkoffload.cc. gcc/ChangeLog: * config/nvptx/mkoffload.cc (process): Recognize '$nohost$...' besides trailing '$nohost' as being for reverse offload. (cherry picked from commit d59858f6ee7f356f27ccc2d29129826781f9483f)
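The name check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual mkoffload.cc code: the helper name `is_reverse_offload_name` and the sample symbol names are hypothetical, and only the two conditions (trailing '$nohost', or '$nohost$' anywhere in the name) come from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not taken from mkoffload.cc): returns true when a
// symbol name should be treated as belonging to reverse offload.
bool is_reverse_offload_name(const std::string &name)
{
  static const std::string suffix = "$nohost";

  // Case 1: the name ends in "$nohost".
  bool trailing = name.size() >= suffix.size()
    && name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;

  // Case 2: "$nohost$" appears with something after it, as happens when
  // an extra suffix such as a counter has been appended to the name.
  bool embedded = name.find("$nohost$") != std::string::npos;

  return trailing || embedded;
}
```

With this sketch, a name such as `foo$nohost$2` is accepted in addition to `foo$nohost`, while a name that merely starts a longer word after `$nohost` (for instance `foo$nohostile`) is not.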
2022-11-16 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-15 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-14 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 16 | -43/+204
Merge up to r12-8907-g58da1386d2233b8e01aaac8f7c4a61a2ccf52743 (14th Nov 2022)
2022-11-14 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-13 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-12 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-11 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-10 | Daily bump. | GCC Administrator | 3 | -1/+10
2022-11-09 | Add guality testcase for RTL alias analysis fix | Eric Botcazou | 1 | -0/+20
gcc/testsuite/ * gcc.dg/guality/param-6.c: New test.
2022-11-09 | Restore RTL alias analysis for hard frame pointer | Eric Botcazou | 1 | -7/+12
The change: 2021-07-28 Bin Cheng <bin.cheng@linux.alibaba.com> alias.c (init_alias_analysis): Don't skip prologue/epilogue. broke the alias analysis for the hard frame pointer (when it is used as a frame pointer, i.e. when the frame pointer is not eliminated) described in the large comment at the top of the file, because static_reg_base_value is set for it and, consequently, new_reg_base_value too. When the instruction saving the stack pointer into the hard frame pointer in the prologue is processed, it is viewed as a second set of the hard frame pointer and to a different value by record_set, which then proceeds to reset new_reg_base_value to 0 and the game is over. gcc/ * alias.cc (init_alias_analysis): Do not record sets to the hard frame pointer if the frame pointer has not been eliminated.
2022-11-09 | Daily bump. | GCC Administrator | 3 | -1/+18
2022-11-08 | Always use TYPE_MODE instead of DECL_MODE for vector field | H.J. Lu | 2 | -2/+40
e034c5c8957 re PR target/78643 (ICE in convert_move, at expr.c:230) fixed the case where DECL_MODE of a vector field is BLKmode and its TYPE_MODE is a vector mode because of target attribute. Remove the BLKmode check for the case where DECL_MODE of a vector field is a vector mode and its TYPE_MODE isn't a vector mode because of target attribute. gcc/ PR target/107304 * expr.cc (get_inner_reference): Always use TYPE_MODE for vector field with vector raw mode. gcc/testsuite/ PR target/107304 * gcc.target/i386/pr107304.c: New test. (cherry picked from commit 1c64aba8cdf6509533f554ad86640f274cdbe37f)
2022-11-08 | Daily bump. | GCC Administrator | 2 | -1/+8
2022-11-07 | amdgcn: Fix expansion of GCN_BUILTIN_LDEXPV builtin | Kwok Cheung Yeung | 2 | -1/+6
2022-11-07 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.cc (gcn_expand_builtin_1): Expand first argument of GCN_BUILTIN_LDEXPV to V64DFmode.
2022-11-07 | Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS | Cui,Lili | 3 | -16/+12
gcc/ChangeLog: * config/i386/driver-i386.cc (host_detect_local_cpu): Move sapphirerapids out of AVX512_VP2INTERSECT. * config/i386/i386.h: Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS * doc/invoke.texi: Remove AVX512_VP2INTERSECT from SAPPHIRERAPIDS
2022-11-07 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-06 | Daily bump. | GCC Administrator | 3 | -1/+22
2022-11-05 | doc: Document correct -fwide-exec-charset defaults [PR41041] | Jonathan Wakely | 1 | -3/+4
As shown in the PR, the default is not UTF-32 but rather UTF-32BE or UTF-32LE, avoiding the need for a byte order mark in literals. gcc/ChangeLog: PR c/41041 * doc/cppopts.texi: Document -fwide-exec-charset defaults correctly. (cherry picked from commit e50ea3a42f058c14ee29327d5277ab0435e3d36b)
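The practical effect of the documented default can be seen in a short sketch. It assumes the default wide execution character set is in use (no explicit -fwide-exec-charset option); the helper names are hypothetical and exist only for illustration. Because the default is an endian-specific encoding, a wide string literal holds just its characters and the terminator, with no byte order mark in front.

```cpp
#include <cassert>
#include <cstddef>

// One wide character plus the terminating null character.
static const wchar_t wide_a[] = L"A";

// Number of wchar_t units stored for the literal, including the terminator.
// A byte order mark would add one more unit at the front.
std::size_t wide_literal_units()
{
  return sizeof (wide_a) / sizeof (wide_a[0]);
}

// Value of the first unit; without a byte order mark it is 'A' itself.
unsigned long first_wide_unit()
{
  return static_cast<unsigned long> (wide_a[0]);
}
```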
2022-11-04 | Fix recent thinko in operand_equal_p | Eric Botcazou | 5 | -14/+61
There is a thinko in a recent improvement made to operand_equal_p where the code just looks at operand 2 of COMPONENT_REF, if it is present, to compare addresses. That's wrong because operand 2 contains the number of DECL_OFFSET_ALIGN-bit-sized words so, when DECL_OFFSET_ALIGN > 8, not all the bytes are included and some of them are in DECL_FIELD_BIT_OFFSET, see get_inner_reference for the model computation. In other words, you would need to compare operand 2 and DECL_OFFSET_ALIGN and DECL_FIELD_BIT_OFFSET in this situation, but I'm not sure this is worth the hassle in practice so the fix just removes this alternate handling. gcc/ * fold-const.cc (operand_compare::operand_equal_p) <COMPONENT_REF>: Do not take into account operand 2. (operand_compare::hash_operand) <COMPONENT_REF>: Likewise. gcc/testsuite/ * gnat.dg/opt99.adb: New test. * gnat.dg/opt99_pkg1.ads, gnat.dg/opt99_pkg1.adb: New helper. * gnat.dg/opt99_pkg2.ads: Likewise.
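The offset arithmetic behind this fix can be modelled with plain integers. The sketch below is a simplified stand-in, not GCC's tree API: the `FieldPos` struct and both functions are hypothetical, and the formula is an assumption based on the description above (operand 2 counts DECL_OFFSET_ALIGN-bit-sized units, and the remaining bits live in DECL_FIELD_BIT_OFFSET).

```cpp
#include <cassert>

// Hypothetical, simplified model of how a field position is split up.
struct FieldPos
{
  unsigned long aligned_units;     // plays the role of COMPONENT_REF operand 2
  unsigned long offset_align_bits; // plays the role of DECL_OFFSET_ALIGN
  unsigned long field_bit_offset;  // plays the role of DECL_FIELD_BIT_OFFSET
};

// Full position in bits: whole aligned units plus the leftover bit offset.
unsigned long bit_position (const FieldPos &f)
{
  return f.aligned_units * f.offset_align_bits + f.field_bit_offset;
}

// The flawed comparison: looks only at the unit count.
bool same_units_only (const FieldPos &a, const FieldPos &b)
{
  return a.aligned_units == b.aligned_units;
}

// Two made-up fields sharing one 128-bit unit but 64 bits apart.
const FieldPos field_a = { 0, 128, 0 };
const FieldPos field_b = { 0, 128, 64 };
```

In this model `same_units_only (field_a, field_b)` is true although the two bit positions differ, which is the kind of false match that comparing operand 2 alone can produce when the alignment unit is larger than 8 bits.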
2022-11-04 | Align with: "OpenMP/Fortran: 'target update' with DT components" | Tobias Burnus | 2 | -7/+9
This commit partially undoes the OG12 commit cb934e37962eeccc8641982b9a9855408979c767 OpenMP/Fortran: 'target update' with strides + DT components to match the mainline (GCC 13) version: r13-3625-g6629444170f85e9b1e243aa07e3e07a8b9f8fce5 OpenMP/Fortran: 'target update' with DT components. The difference is that strides are not permitted in the mainline version; for the reason and to-do, see: https://gcc.gnu.org/PR107517 Interdiff changelog: 2022-11-04 Tobias Burnus <tobias@codesourcery.com> gcc/fortran/ChangeLog.omp Partial Revert: 2022-11-02 Tobias Burnus <tobias@codesourcery.com> * openmp.cc (resolve_omp_clauses): Accept noncontiguous arrays. libgomp/ChangeLog.omp * testsuite/libgomp.fortran/target-13.f90: Remove strides to match mainline (GCC 13) version. (cherry picked from commit 6629444170f85e9b1e243aa07e3e07a8b9f8fce5)
2022-11-04 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 12 | -5/+235
Merge up to r12-8891-g14a92220a2f061328aae32ee6b5cdc7f62375902 (4th Nov 2022) Note: This includes in principle some OpenMP/libgomp commits, but those have been cherry picked already.
2022-11-04 | Daily bump. | GCC Administrator | 5 | -1/+120
2022-11-03 | i386: Fix uninitialized register after peephole2 conversion [PR107404] | Uros Bizjak | 2 | -1/+55
The "eliminate reg-reg move by inverting the condition of a cmove #2" peephole2 converts the following sequence:
    473: bx:DI=[r14:DI*0x8+r12:DI]
    960: r15:DI=r8:DI
    485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;}
    737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI}
to:
    1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;}
    1111: r15:DI=[r14:DI*0x8+r12:DI]
    1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI}
Please note that (insn 1110) uses register BX, but its initialization was eliminated. Avoid the conversion if the eliminated move initialized a register used in the moved instruction. 2022-11-03 Uroš Bizjak <ubizjak@gmail.com> gcc/ChangeLog: PR target/107404 * config/i386/i386.md (eliminate reg-reg move by inverting the condition of a cmove #2 peephole2): Check if eliminated move initialized a register, used in the moved instruction. gcc/testsuite/ChangeLog: PR target/107404 * g++.target/i386/pr107404.C: New test. (cherry picked from commit 553b1d3dd5b9253ebdf66ee3260c717d5b807dd1)
2022-11-03 | c, c++: Fix up excess precision handling of scalar_to_vector conversion [PR107358] | Jakub Jelinek | 2 | -2/+32
As mentioned earlier in the C++ excess precision support mail, the following testcase is broken with excess precision both in C and C++ (though just in C++ it was triggered in real-world code). scalar_to_vector is called in both FEs after the excess precision promotions (or stripping of EXCESS_PRECISION_EXPR), so we can then get invalid diagnostics that say float vector + float involves truncation (on ia32 from long double to float). The following patch fixes that by calling scalar_to_vector on the operands before the excess precision promotions, let scalar_to_vector just do the diagnostics (it does e.g. fold_for_warn so it will fold EXCESS_PRECISION_EXPR around REAL_CST to constants etc.) but will then do the actual conversions using the excess precision promoted operands (so say if we have vector double + (float + float) we don't actually do vector double + (float) ((long double) float + (long double) float) but vector double + (double) ((long double) float + (long double) float) 2022-10-24 Jakub Jelinek <jakub@redhat.com> PR c++/107358 gcc/c/ * c-typeck.cc (build_binary_op): Pass operands before excess precision promotions to scalar_to_vector call. gcc/testsuite/ * c-c++-common/pr107358.c: New test. (cherry picked from commit 65e3274e363cb2c6bfe6b5e648916eb7696f7e2f)
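For orientation, the kind of expression involved looks like the sketch below. It is a made-up example, not the testcase from the PR: the typedef and function names are hypothetical, and it relies on the GNU vector extension (`vector_size`). A float vector is added to a scalar float sum; on a target that evaluates `a + b` with excess precision, the scalar operand is wider than float when the scalar-to-vector conversion is checked, which is where the spurious truncation diagnostic came from.

```cpp
#include <cassert>

// Four packed floats, using the GNU vector extension.
typedef float v4sf __attribute__ ((vector_size (16)));

// Adds the scalar sum (a + b) to every lane of v. The scalar operand is
// converted to a vector implicitly.
v4sf add_scalar_sum (v4sf v, float a, float b)
{
  return v + (a + b);
}

// Hypothetical convenience wrapper: builds {1, 2, 3, 4}, adds 0.25f + 0.5f
// to every lane and returns lane i of the result.
float demo_lane (int i)
{
  v4sf v = { 1.0f, 2.0f, 3.0f, 4.0f };
  v4sf r = add_scalar_sum (v, 0.25f, 0.5f);
  return r[i];
}
```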
2022-11-03 | c++: Fix up constexpr handling of char/signed char/short pre/post inc/decrement [PR105774] | Jakub Jelinek | 2 | -0/+27
signed char, char or short int pre/post inc/decrement are represented by normal {PRE,POST}_{INC,DEC}REMENT_EXPRs in the FE and only gimplification ensures that the {PLUS,MINUS}_EXPR is done in unsigned version of those types:
    case PREINCREMENT_EXPR:
    case PREDECREMENT_EXPR:
    case POSTINCREMENT_EXPR:
    case POSTDECREMENT_EXPR:
      {
        tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
        if (INTEGRAL_TYPE_P (type) && c_promoting_integer_type_p (type))
          {
            if (!TYPE_OVERFLOW_WRAPS (type))
              type = unsigned_type_for (type);
            return gimplify_self_mod_expr (expr_p, pre_p, post_p, 1, type);
          }
        break;
      }
This means during constant evaluation we need to do it similarly (either using unsigned_type_for or using widening to integer_type_node). The following patch does the latter. 2022-10-24 Jakub Jelinek <jakub@redhat.com> PR c++/105774 * constexpr.cc (cxx_eval_increment_expression): For signed types that promote to int, evaluate PLUS_EXPR or MINUS_EXPR in int type. * g++.dg/cpp1y/constexpr-105774.C: New test. (cherry picked from commit da8c362c4c18cff2f2dfd5c4706bdda7576899a4)
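The "widen to int" approach mentioned above can be illustrated in ordinary C++. This is only a sketch of the arithmetic, not the constexpr.cc change itself: the function name is hypothetical, and it assumes an 8-bit signed char and the modulo narrowing conversion that GCC implements (and that C++20 requires).

```cpp
#include <cassert>

// Hypothetical illustration: increment a signed char by doing the addition
// in int, then converting the result back to the narrow type.
signed char increment_via_int (signed char c)
{
  int wide = c;    // promote to int first
  wide = wide + 1; // the addition happens in int, so it cannot overflow here
  // Narrow back; an out-of-range value wraps modulo 2^8 under the
  // assumptions stated above.
  return static_cast<signed char> (wide);
}
```

Doing the addition in the promoted type means the only step that can leave the range of signed char is the final conversion, not the arithmetic.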
2022-11-03 | tree-cfg: Fix a verification diagnostic typo [PR107121] | Jakub Jelinek | 1 | -1/+1
Obvious typo in diagnostics. 2022-10-02 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/107121 * tree-cfg.cc (verify_gimple_call): Fix a typo in diagnostics, DEFFERED_INIT -> DEFERRED_INIT. (cherry picked from commit d01bd0b0f3b8f4c33c437ff10f0b949200627f56)
2022-11-03 | openmp: Fix ICE with taskgroup at -O0 -fexceptions [PR107001] | Jakub Jelinek | 3 | -3/+29
The following testcase ICEs because with -O0 -fexceptions GOMP_taskgroup_end call isn't directly followed by GOMP_RETURN statement, but there are some conditionals to handle exceptions and we fail to find the correct GOMP_RETURN. The fix is to treat taskgroup similarly to target data, both of these constructs emit a try { body } finally { end_call } around the construct's body during gimplification and we need to see proper construct nesting during gimplification and omp lowering (including nesting of regions checks), but during omp expansion we don't really need their nesting anymore, all we need is emit something at the start of the region and the end of the region is the end API call we've already emitted during gimplification. For target data, we weren't adding GOMP_RETURN statement during omp lowering, so after that pass it is treated merely like stand-alone omp directives. This patch does the same for taskgroup too. 2022-09-24 Jakub Jelinek <jakub@redhat.com> PR c/107001 * omp-low.cc (lower_omp_taskgroup): Don't add GOMP_RETURN statement at the end. * omp-expand.cc (build_omp_regions_1): Clarify GF_OMP_TARGET_KIND_DATA is not stand-alone directive. For GIMPLE_OMP_TASKGROUP, also don't update parent. (omp_make_gimple_edges) <case GIMPLE_OMP_TASKGROUP>: Reset cur_region back after new_omp_region. * c-c++-common/gomp/pr107001.c: New test. (cherry picked from commit ad2aab5c816a6fd56b46210c0a4a4c6243da1de9)
2022-11-03 | openmp, c: Tighten up c_tree_equal [PR106981] | Jakub Jelinek | 2 | -6/+19
This patch changes c_tree_equal to work more like cp_tree_equal, be more strict in what it accepts. The ICE on the first testcase was due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which ICEs if the two constants have different precision, but as the second testcase shows, being too lenient in it can also lead to miscompilation of valid OpenMP programs where we think certain expression is the same even when it isn't and can be guaranteed at runtime to represent different memory location. So, the patch looks through only NON_LVALUE_EXPRs and for constants as well as casts requires that the types match before actually comparing the constant values or recursing on the cast operands. 2022-09-24 Jakub Jelinek <jakub@redhat.com> PR c/106981 gcc/c/ * c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and t2 have different types. gcc/testsuite/ * c-c++-common/gomp/pr106981.c: New test. libgomp/ * testsuite/libgomp.c-c++-common/pr106981.c: New test. (cherry picked from commit 3c5bccb608c665ac3f62adb1817c42c845812428)
2022-11-03 | openmp: Fix handling of target constructs in static member functions [PR106829] | Jakub Jelinek | 2 | -9/+23
Just calling current_nonlambda_class_type in static member functions returns non-NULL, but something that isn't *this and if unlucky can match part of the IL and can be added to target clauses. if (DECL_NONSTATIC_MEMBER_P (decl) && current_class_ptr) is a guard used elsewhere (in check_accessibility_of_qualified_id). 2022-09-07 Jakub Jelinek <jakub@redhat.com> PR c++/106829 * semantics.cc (finish_omp_target_clauses): If current_function_decl isn't a nonstatic member function, don't set data.current_object to non-NULL. * g++.dg/gomp/pr106829.C: New test. (cherry picked from commit e90af965e5c858ba02c0cdfbac35d0a19da1c2f6)
2022-11-03 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-02 | Fortran/OpenMP: Fix DT struct-component with 'alloc' and array descr | Tobias Burnus | 2 | -1/+7
Using 'map(alloc: var, dt%comp)' requires a 'to' mapping of the array descriptor, as otherwise the bounds are not available in the target region; likewise for character strings. This patch implements this; however, some additional issues are exposed by the testcase; those are '#if 0'ed and will be handled later. Submitted to mainline (but pending review): https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604887.html gcc/fortran/ChangeLog: * trans-openmp.cc (gfc_trans_omp_clauses): Ensure DT struct-comp with array descriptor and 'alloc:' have the descriptor mapped with 'to:'. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-enter-data-3.f90: New test.
2022-11-02 | OpenMP/Fortran: 'target update' with strides + DT components | Tobias Burnus | 3 | -9/+27
OpenMP 5.0 permits using arrays with strides and derived type components for the list items to the 'from'/'to' clauses of the 'target update' directive. Submitted to mainline (but pending review): https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604687.html gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_clauses): Permit derived types. (resolve_omp_clauses): Accept noncontiguous arrays. * trans-openmp.cc (gfc_trans_omp_clauses): Fixes for derived-type changes; fix size for scalars. libgomp/ChangeLog: * testsuite/libgomp.fortran/target-11.f90: New test. * testsuite/libgomp.fortran/target-13.f90: New test.
2022-11-02 | Merge branch 'releases/gcc-12' into devel/omp/gcc-12 | Tobias Burnus | 11 | -2/+207
Merge up to r12-8881-gb80a690673272919896ee5939250e50d882f2418 (2nd Nov 2022)
2022-11-02 | amdgcn: Enable SIMD vectorization of math functions | Kwok Cheung Yeung | 5 | -0/+339
Calls to vectorized versions of routines in the math library will now be inserted when vectorizing code containing supported math functions. 2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com> Paul-Antoine Arras <pa@codesourcery.com> gcc/ * builtins.cc (mathfn_built_in_explicit): New. * config/gcn/gcn.cc: Include case-cfn-macros.h. (mathfn_built_in_explicit): Add prototype. (gcn_vectorize_builtin_vectorized_function): New. (gcn_libc_has_function): New. (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION): Define. (TARGET_LIBC_HAS_FUNCTION): Define. gcc/testsuite/ * gcc.target/gcn/simd-math-1.c: New testcase. libgomp/ * testsuite/libgomp.c/simd-math-1.c: New testcase.
2022-11-02 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-11-01 | amdgcn: Add builtins for vector floor/floorf | Kwok Cheung Yeung | 3 | -0/+39
2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn-builtins.def (FLOORVF): New builtin. (FLOORV): New builtin. * config/gcn/gcn.cc (gcn_expand_builtin_1): Expand GCN_BUILTIN_FLOORVF and GCN_BUILTIN_FLOORV.
2022-11-01 | amdgcn: Fix expansion of builtin for vector fabs operation | Kwok Cheung Yeung | 2 | -3/+6
2022-11-01 Kwok Cheung Yeung <kcy@codesourcery.com> * config/gcn/gcn.cc (gcn_expand_builtin_1): Fix expansion of GCN_BUILTIN_FABSV.
2022-11-01 | openmp: Bugfix in omp_expand_metadirective for same blocks/edges to be deleted. | Marcel Vollweiler | 4 | -0/+26
This patch fixes an ICE in omp_expand_metadirective that occurs when deletion of the basic_block for a metadirective label is attempted multiple times. To avoid this situation, processed labels are added to the already existing list of labels that are not intended to be deleted. The issue occurred in the attached test case. gcc/ChangeLog: * omp-expand-metadirective.cc (omp_expand_metadirective): Add already processed labels to "labels" (the list of labels not to be deleted). gcc/testsuite/ChangeLog: * c-c++-common/gomp/metadirective-8.c: New test.
2022-11-01 | amdgcn: add fmin/fmax patterns | Andrew Stubbs | 3 | -0/+43
Add fmin/fmax for scalar, vector, and reductions. The smin/smax patterns are already using the IEEE compliant hardware instructions anyway, so we can just expand to use those insns. gcc/ChangeLog: * config/gcn/gcn-valu.md (fminmaxop): New iterator. (<fexpander><mode>3): New define_expand. (<fexpander><mode>3<exec>): Likewise. (reduc_<fexpander>_scal_<mode>): Likewise. * config/gcn/gcn.md (fexpander): New attribute. (cherry picked from commit 10aa0356118f44e5f4d720a2a4c731b173baa298)
2022-11-01 | amdgcn: multi-size vector reductions | Andrew Stubbs | 4 | -94/+69
Add support for vector reductions for any vector width by switching iterators and generalising the code slightly. There's no one-instruction way to move an item from lane 31 to lane 0 (63, 15, 7, 3, and 1 are all fine though), and vec_extract is probably fewer cycles anyway, so now we always reduce to an SGPR. gcc/ChangeLog: * config/gcn/gcn-valu.md (V64_SI): Delete iterator. (V64_DI): Likewise. (V64_1REG): Likewise. (V64_INT_1REG): Likewise. (V64_2REG): Likewise. (V64_ALL): Likewise. (V64_FP): Likewise. (reduc_<reduc_op>_scal_<mode>): Use V_ALL. Use gen_vec_extract. (fold_left_plus_<mode>): Use V_FP. (*<reduc_op>_dpp_shr_<mode>): Use V_1REG. (*<reduc_op>_dpp_shr_<mode>): Use V_DI. (*plus_carry_dpp_shr_<mode>): Use V_INT_1REG. (*plus_carry_in_dpp_shr_<mode>): Use V_SI. (*plus_carry_dpp_shr_<mode>): Use V_DI. (mov_from_lane63_<mode>): Delete. (mov_from_lane63_<mode>): Delete. * config/gcn/gcn.cc (gcn_expand_reduc_scalar): Support partial vectors. * config/gcn/gcn.md (unspec): Remove UNSPEC_MOV_FROM_LANE63. (cherry picked from commit f539029c1ce6fb9163422d1a8b6ac12a2554eaa2)
2022-11-01 | Daily bump. | GCC Administrator | 1 | -1/+1
2022-10-31 | Daily bump. | GCC Administrator | 1 | -1/+1