aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-05-19Use OpenACC code to process OpenMP target regionsdevel/omp/gcc-12Chung-Lin Tang29-58/+1079
This is a backport of: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html This patch implements '-fopenmp-target=acc', which enables internally handling a subset of OpenMP target regions as OpenACC parallel regions. This basically includes target, teams, parallel, distribute, for/do constructs, and atomics. Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code paths handle them, with various needed adjustments throughout middle-end and nvptx backend. When using this "OMPACC" mode, if there are cases the patch doesn't handle, it issues a warning, and reverts to normal processing for that target region. gcc/ChangeLog: * builtins.cc (expand_builtin_omp_builtins): New function. (expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER, BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS, BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using expand_builtin_omp_builtins, enabled under -fopenmp-target=acc. * cgraphunit.cc (analyze_functions): Add call to omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc. * common.opt (fopenmp-target=): Add new option and enums. * config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=. * config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New prototype. (nvptx_mem_shared_p): Likewise. * config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX symbol for number of threads in team. (omp_num_threads_align): New var for alignment of omp_num_threads_sym. (need_omp_num_threads): New bool for if any function references omp_num_threads_sym. (nvptx_option_override): Initialize omp_num_threads_sym/align. (write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode. (nvptx_declare_function_name): Disable shim function under OMPACC mode. Disable soft-stack under OMPACC mode. Add generation of neutering init code under OMPACC mode. (nvptx_output_set_softstack): Return "" under OMPACC mode. (nvptx_expand_call): Set parallelism to vector for function calls with "ompacc for" attached. (nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode. (nvptx_expand_oacc_join): Likewise. (nvptx_expand_omp_get_num_threads): New function. (nvptx_mem_shared_p): New function. (nvptx_mach_max_workers): Return 1 under OMPACC mode. (nvptx_mach_vector_length): Return 32 under OMPACC mode. (nvptx_single): Add adjustments for OMPACC mode, which have parallel-construct fork/joins, and regions of code where neutering is dynamically determined. (nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for" attribute is attached to function. Disable uniform-simt when under OMPACC mode. (nvptx_file_end): Write __nvptx_omp_num_threads out when needed. (nvptx_goacc_fork_join): Return true under OMPACC mode. * config/nvptx/nvptx.h (struct GTY(()) machine_function): Add omp_parallel_predicate and omp_fn_entry_num_threads_reg fields. * config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID, UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID, UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries. (nvptx_shared_mem_operand): New predicate. (gomp_barrier): New expand pattern. (omp_get_num_threads): New expand pattern. (omp_get_num_teams): New insn pattern. (omp_get_thread_num): Likewise. (omp_get_team_num): Likewise. (get_ntid): Likewise. (nvptx_omp_parallel_fork): Likewise. (nvptx_omp_parallel_join): Likewise. * flag-types.h (omp_target_mode_kind): New flag value enum. * gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field. (gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_ctx_ompacc_p): New function. (gimplify_omp_for): Handle combined loops under OMPACC. * lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_. * omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST. (BUILT_IN_OMP_GET_NUM_THREADS): Likewise. * omp-expand.cc (remove_exit_barrier): Disable addressable-var processing for parallel construct child functions under OMPACC mode. (expand_oacc_for): Add OMPACC mode handling. (get_target_arguments): Force thread_limit clause value to 1 under OMPACC mode. (expand_omp): Under OMPACC mode, avoid child function expanding of GIMPLE_OMP_PARALLEL. * omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode. * omp-low.cc (struct omp_context): Add 'bool ompacc_p' field. (scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_. (ompacc_ctx_p): New function. (scan_omp_parallel): Handle OMPACC mode, avoid creating child function. (scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target construct child function, remove OMP_CLAUSE__OMPACC_ clauses. (lower_oacc_head_mark): Handle OMPACC mode cases. (lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add vector/gang clauses as needed. Add other OMPACC handling. (lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case. (lower_omp_target): Do OpenACC gang privatization under OMPACC case. (lower_omp_teams): Forward OpenACC privatization variables to outer target region under OMPACC mode. (lower_omp_1): Do OpenACC gang privatization under OMPACC case for GIMPLE_BIND. * omp-offload.cc (ompacc_supported_clauses_p): New function. (struct target_region_data): New struct type for tree walk. (scan_fndecl_for_ompacc): New function. (scan_omp_target_region_r): New function. (scan_omp_target_construct_r): New function. (omp_ompacc_attribute_tagging): New function. (oacc_dim_call): Add OMPACC case handling. (execute_oacc_device_lower): Make parts explicitly only OpenACC enabled. (pass_oacc_device_lower::gate): Enable pass under OMPACC mode. * omp-offload.h (omp_ompacc_attribute_tagging): New prototype. * opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp and no -fopenacc. * target-insns.def (gomp_barrier): New defined insn pattern. (omp_get_thread_num): Likewise. (omp_get_num_threads): Likewise. (omp_get_team_num): Likewise. (omp_get_num_teams): Likewise. * tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry for internal clause. * tree-nested.cc (convert_nonlocal_omp_clauses): Handle OMP_CLAUSE__OMPACC_. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_. * tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry. (omp_clause_code_name): Likewise. * tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_. * tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC mode cases. libgomp/ChangeLog: * config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in shared memory.
2023-05-18LTO: Fix writing of toplevel asm with offloading [PR109816]Tobias Burnus3-1/+105
When offloading was enabled, top-level 'asm' were added to the offloading section, confusing assemblers which did not support the syntax. Additionally, with offloading and -flto, the top-level assembler code did not end up in the host files. As r14-321-g9a41d2cdbcd added top-level 'asm' to one libstdc++ header file, the issue became more apparent, causing fails with nvptx for some C++ testcases. PR libstdc++/109816 gcc/ChangeLog: * lto-cgraph.cc (output_symtab): Guard lto_output_toplevel_asms by '!lto_stream_offload_p'. libgomp/ChangeLog: * testsuite/libgomp.c++/target-map-class-1.C: New test. * testsuite/libgomp.c++/target-map-class-2.C: New test. (cherry picked from commit a835f046cdf017b9e8ad5576df4f10daaf8420d0)
2023-05-11Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]Tobias Burnus2-4/+45
This is one part of the fix for PR109128, along with a corresponding binutils's linker change. Without this patch, what happens in the linker, when an unused object in a .a file has offload data, is that elf_link_is_defined_archive_symbol calls bfd_link_plugin_object_p, which ends up calling the plugin's claim_file_handler, which then records the object as one with offload data. That is, the linker never decides to use the object in the first place, but use of this _p interface (called as part of trying to decide whether to use the object) results in the plugin deciding to use its offload data (and a consequent mismatch in the offload data present at runtime). The new hook allows the linker plugin to distinguish calls to claim_file_handler that know the object is being used by the linker (from ldmain.c:add_archive_element), from calls that don't know it's being used by the linker (from elf_link_is_defined_archive_symbol); in the latter case, the plugin should avoid recording the object as one with offload data. PR middle-end/109128 include/ * plugin-api.h (ld_plugin_claim_file_handler_v2) (ld_plugin_register_claim_file_v2) (LDPT_REGISTER_CLAIM_FILE_HOOK_V2): New. (struct ld_plugin_tv): Add tv_register_claim_file_v2. lto-plugin/ * lto-plugin.c (register_claim_file_v2): New. (claim_file_handler_v2): New. (claim_file_handler): Wrap claim_file_handler_v2. (onload): Handle LDPT_REGISTER_CLAIM_FILE_HOOK_V2. (cherry picked from commit c49d51fa8134f6c7e6c7cf6e4e3007c4fea617c5)
2023-05-04openmp: Fix initialization for 'unroll full'Frederik Harwath18-8/+79
The index variable initialization for the 'omp unroll' directive with 'full' clause got lost and the testsuite did not catch it. Add the initialization and add -Wall to some tests to detect uninitialized variable uses and other potential problems in the code generation. gcc/ChangeLog: * omp-transform-loops.cc (full_unroll): Add initialization of index variable. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: Use -Wall and add -Wno-unknown-pragmas to disable warnings about empty pragmas. Use -O2. * testsuite/libgomp.c++/loop-transforms/matrix-no-directive-unroll-full-1.C: Copy of testsuite/libgomp.c-c++-common/matrix-no-directive-unroll-full-1.c, but using -O0 which works only for C++. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: Use -Wall and use -Wno-unknown-pragmas to disable warnings about empty pragmas. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: Likewise and fix broken function calls found by -Wall.
2023-04-27amdgcn: Fix addsub bugAndrew Stubbs2-8/+24
The vec_fmsubadd instuction actually had add twice, by mistake. Also improve code-gen for all the complex patterns by using properly undefined values. Mostly this just prevents the compiler reserving space in the stack frame. gcc/ChangeLog: * config/gcn/gcn-valu.md (cmul<conj_op><mode>3): Use gcn_gen_undef. (cml<addsub_as><mode>4): Likewise. (vec_addsub<mode>3): Likewise. (cadd<rot><mode>3): Likewise. (vec_fmaddsub<mode>4): Likewise. (vec_fmsubadd<mode>4): Likewise, and use sub for the odd lanes.
2023-04-25openmp: Fix loop transformation testsFrederik Harwath4-2/+12
libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: Add reduction clause. * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: Initialize var. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: Add reduction and initialization.
2023-04-21amdgcn: bug fix ldexp insnAndrew Stubbs2-16/+18
The vop3 instructions don't support B constraint immediates. Also, take the use the SV_FP iterator to delete a redundant pattern. gcc/ChangeLog: * config/gcn/gcn-valu.md (vnsi, VnSI): Add scalar modes. (ldexp<mode>3): Delete. (ldexp<mode>3<exec>): Change "B" to "A".
2023-04-21amdgcn: update target-supports.expAndrew Stubbs2-5/+22
The backend can now vectorize more things. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect_call_copysignf): Add amdgcn. (check_effective_target_vect_call_sqrtf): Add amdgcn. (check_effective_target_vect_call_ceilf): Add amdgcn. (check_effective_target_vect_call_floor): Add amdgcn. (check_effective_target_vect_logical_reduc): Add amdgcn.
2023-04-20amdgcn, openmp: Fix concurrency in low-latency allocatorAndrew Stubbs2-0/+7
The previous code works fine on Fiji and Vega 10 devices, but bogs down in The spin locks on Vega 20 or newer. Adding the sleep instructions fixes the problem. libgomp/ChangeLog: * basic-allocator.c (basic_alloc_free): Use BASIC_ALLOC_YIELD. (basic_alloc_realloc): Use BASIC_ALLOC_YIELD.
2023-04-20Add missing changelog entriesAndrew Stubbs2-0/+25
2023-04-18amdgcn: HardFP divideAndrew Stubbs4-97/+144
Implement FP division using hardware instructions. This replaces both the softfp library calls, and the --fast-math inaccurate divsion we had previously. The GCN architecture does not have a single divide instruction, but it does have a number of support instructions designed to make multiply-by-reciprocal sufficiently accurate for non-fast-math usage. gcc/ChangeLog: * config/gcn/gcn-valu.md (SV_SFDF): New iterator. (SV_FP): New iterator. (scalar_mode, SCALAR_MODE): Add identity mappings for scalar modes. (recip<mode>2): Unify the two patterns using SV_FP. (div_scale<mode><exec_vcc>): New insn. (div_fmas<mode><exec>): New insn. (div_fixup<mode><exec>): New insn. (div<mode>3): Unify the two expanders and rewrite using hardfp. * config/gcn/gcn.cc (gcn_md_reorg): Support "vccwait" attribute. * config/gcn/gcn.md (unspec): Add UNSPEC_DIV_SCALE, UNSPEC_DIV_FMAS, and UNSPEC_DIV_FIXUP. (vccwait): New attribute. gcc/testsuite/ChangeLog: * gcc.target/gcn/fpdiv.c: Remove the -ffast-math requirement. (cherry picked from commit cfdc45f73c56ad051a53576a4e88675ced2660d4)
2023-04-13if-conv: Restore MASK_CALL conversion [PR108888]Andre Vieira6-5/+24
The original patch to fix this PR broke the if-conversion of calls into IFN_MASK_CALL. This patch restores that original behaviour and makes sure the tests added earlier specifically test inbranch SIMD clones. gcc/ChangeLog: PR tree-optimization/108888 * tree-if-conv.cc (predicate_statements): Fix gimple call check. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16.c: Make simd clone inbranch only. * gcc.dg/vect/vect-simd-clone-17.c: Likewise. * gcc.dg/vect/vect-simd-clone-18.c: Likewise. (cherry picked from commit 58c8c1b383bc3c286d6527fc6e8fb62463f9a877)
2023-04-05[og12] OpenMP: Fix checking ICE in "declare target" ctor/dtor supportJulian Brown2-12/+42
This patch fixes an ICE with checking enabled with the patch: abcb5dbac666513c798e574808f849f76a1c0799 There were two problems: first, OMP_CLAUSE_CHAIN was erroneously used as the chain pointer instead of TREE_CHAIN for a non-OMP clause list. Secondly, "copy_node" by itself is not sufficient to clone the initialization statement for use in the on-target constructor/destructor function. Instead we now use walk_tree with "copy_tree_body_r" and appropriate configuration parameters. 2023-04-05 Julian Brown <julian@codesourcery.com> gcc/cp/ * decl2.cc (tree-inline.h): Include. (do_static_initialization_or_destruction): Change OMP_TARGET parameter to pass the host version of the SSDF function decl. Use copy_tree_body_r to clone init stmt. Update forward declaration. (c_parse_final_cleanups): Update calls to do_static_initialization_or_destruction. Use TREE_CHAIN instead of OMP_CLAUSE_CHAIN.
2023-04-03amdgcn: Add 64-bit vector notAndrew Stubbs2-0/+21
gcc/ChangeLog: * config/gcn/gcn-valu.md (one_cmpl<mode>2<exec>): New.
2023-04-03'-foffload-memory=pinned' using offloading device interfaces for ↵Thomas Schwinge2-5/+53
non-contiguous array support Changes related to og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". libgomp/ * target.c (gomp_map_vars_internal) <non-contiguous array support>: Handle 'always_pinned_mode'.
2023-04-03'-foffload-memory=pinned' using offloading device interfacesThomas Schwinge15-153/+1339
Implemented for nvptx offloading via 'cuMemHostAlloc', 'cuMemHostRegister'. gcc/ * doc/invoke.texi (-foffload-memory=pinned): Document. include/ * cuda/cuda.h (CUresult): Add 'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'. (CUdevice_attribute): Add 'CU_DEVICE_ATTRIBUTE_READ_ONLY_HOST_REGISTER_SUPPORTED'. (CU_MEMHOSTREGISTER_READ_ONLY): Add. (cuMemHostGetFlags, cuMemHostRegister, cuMemHostUnregister): Add. libgomp/ * libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add 'struct goacc_asyncqueue *' formal parameter. (GOMP_OFFLOAD_page_locked_host_register) (GOMP_OFFLOAD_page_locked_host_unregister) (GOMP_OFFLOAD_page_locked_host_p): Add. * libgomp.h (always_pinned_mode) (gomp_page_locked_host_register_dev) (gomp_page_locked_host_unregister_dev): Add. (struct splay_tree_key_s): Add 'page_locked_host_p'. (struct gomp_device_descr): Add 'GOMP_OFFLOAD_page_locked_host_register', 'GOMP_OFFLOAD_page_locked_host_unregister', 'GOMP_OFFLOAD_page_locked_host_p'. * libgomp.texi (-foffload-memory=pinned): Document. * plugin/cuda-lib.def (cuMemHostGetFlags, cuMemHostRegister_v2) (cuMemHostRegister, cuMemHostUnregister): Add. * plugin/plugin-nvptx.c (struct ptx_device): Add 'read_only_host_register_supported'. (nvptx_open_device): Initialize it. (free_host_blocks, free_host_blocks_lock) (nvptx_run_deferred_page_locked_host_free) (nvptx_page_locked_host_free_callback, nvptx_page_locked_host_p) (GOMP_OFFLOAD_page_locked_host_register) (nvptx_page_locked_host_unregister_callback) (GOMP_OFFLOAD_page_locked_host_unregister) (GOMP_OFFLOAD_page_locked_host_p) (nvptx_run_deferred_page_locked_host_unregister) (nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback): Add. (GOMP_OFFLOAD_fini_device, GOMP_OFFLOAD_page_locked_host_alloc) (GOMP_OFFLOAD_run): Call 'nvptx_run_deferred_page_locked_host_free'. (struct goacc_asyncqueue): Add 'page_locked_host_unregister_blocks_lock', 'page_locked_host_unregister_blocks'. (nvptx_goacc_asyncqueue_construct) (nvptx_goacc_asyncqueue_destruct): Handle those. (GOMP_OFFLOAD_page_locked_host_free): Handle 'struct goacc_asyncqueue *' formal parameter. (GOMP_OFFLOAD_openacc_async_test) (nvptx_goacc_asyncqueue_synchronize): Call 'nvptx_run_deferred_page_locked_host_unregister'. (GOMP_OFFLOAD_openacc_async_serialize): Call 'nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback'. * config/linux/allocator.c (linux_memspace_alloc) (linux_memspace_calloc, linux_memspace_free) (linux_memspace_realloc): Remove 'always_pinned_mode' handling. (GOMP_enable_pinned_mode): Move... * target.c: ... here. (always_pinned_mode, verify_always_pinned_mode) (gomp_verify_always_pinned_mode, gomp_page_locked_host_alloc_dev) (gomp_page_locked_host_free_dev) (gomp_page_locked_host_aligned_alloc_dev) (gomp_page_locked_host_aligned_free_dev) (gomp_page_locked_host_register_dev) (gomp_page_locked_host_unregister_dev): Add. (gomp_copy_host2dev, gomp_map_vars_internal) (gomp_remove_var_internal, gomp_unmap_vars_internal) (get_gomp_offload_icvs, gomp_load_image_to_device) (gomp_target_rev, omp_target_memcpy_copy) (omp_target_memcpy_rect_worker): Handle 'always_pinned_mode'. (gomp_copy_host2dev, gomp_copy_dev2host): Handle 'verify_always_pinned_mode'. (GOMP_target_ext): Add 'assert'. (gomp_page_locked_host_alloc): Use 'gomp_page_locked_host_alloc_dev'. (gomp_page_locked_host_free): Use 'gomp_page_locked_host_free_dev'. (omp_target_associate_ptr): Adjust. (gomp_load_plugin_for_device): Handle 'page_locked_host_register', 'page_locked_host_unregister', 'page_locked_host_p'. * oacc-mem.c (memcpy_tofrom_device): Handle 'always_pinned_mode'. * libgomp_g.h (GOMP_enable_pinned_mode): Adjust. * testsuite/libgomp.c/alloc-pinned-7.c: Remove.
2023-04-03OpenACC: Pass pre-allocated 'ptrblock' to ↵Thomas Schwinge4-6/+13
'goacc_noncontig_array_create_ptrblock' [PR76739] ... to simplify later changes. No functional change. Follow-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". PR other/76739 libgomp/ * target.c (gomp_map_vars_internal): Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock'. * oacc-parallel.c (goacc_noncontig_array_create_ptrblock): Adjust. * oacc-int.h (goacc_noncontig_array_create_ptrblock): Adjust.
2023-04-03libgomp: Document OpenMP 'pinned' memoryThomas Schwinge2-0/+13
libgomp/ * libgomp.texi (AMD Radeon, nvptx): Document OpenMP 'pinned' memory.
2023-04-02Resolve 'error: unused parameter' in ↵Thomas Schwinge2-6/+9
'gcc/cp/decl2.cc:one_static_initialization_or_destruction' [...]/gcc/cp/decl2.cc: In function ‘void one_static_initialization_or_destruction(tree, tree, bool, bool)’: [...]/gcc/cp/decl2.cc:4171:48: error: unused parameter ‘omp_target’ [-Werror=unused-parameter] 4171 | bool omp_target) | ~~~~~^~~~~~~~~~ cc1plus: all warnings being treated as errors make[3]: *** [cp/decl2.o] Error 1 Fix-up for og12 commit abcb5dbac666513c798e574808f849f76a1c0799 "[og12] OpenMP: Constructors and destructors for "declare target" static aggregates". gcc/cp/ * decl2.cc (one_static_initialization_or_destruction): Remove 'omp_target' formal parameter. Adjust all users.
2023-03-31openmp: Handle GIMPLE_OMP_METADIRECTIVE in walk_omp_for_loopsFrederik Harwath2-0/+20
gcc/ChangeLog: * omp-transform-loops.cc (walk_omp_for_loops): Handle GIMPLE_OMP_METADIRECTIVE.
2023-03-28Add missing ChangeLog.omp entriesFrederik Harwath6-0/+413
... for commits b35b06e9d39..09f39bb1141
2023-03-28Add missing ChangeLog.omp entriesJulian Brown3-0/+31
2023-03-27[og12] OpenMP: Constructors and destructors for "declare target" static ↵Julian Brown5-35/+257
aggregates This patch adds support for running constructors and destructors for static (file-scope) aggregates for C++ objects which are marked with "declare target" directives on OpenMP offload targets. At present, space is allocated on the target for such aggregates, but nothing ever constructs them properly, so they end up zero-initialised. Tested with offloading to AMD GCN. I will apply to the og12 branch shortly. ChangeLog 2023-03-27 Julian Brown <julian@codesourcery.com> gcc/cp/ * decl2.cc (priority_info): Add omp_tgt_initializations_p and omp_tgt_destructions_p. (start_objects, start_static_storage_duration_function, do_static_initialization_or_destruction, one_static_initialization_or_destruction, generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support "declare target" decls. Update forward declarations. (OMP_SSDF_IDENTIFIER): New macro. (omp_tgt_ssdf_decls): New vec. (get_priority_info): Initialize omp_tgt_initializations_p and omp_tgt_destructions_p fields. (handle_tls_init): Update call to omp_static_initialization_or_destruction. (c_parse_final_cleanups): Support constructors/destructors on OpenMP offload targets. gcc/ * omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): New builtin. * tree.cc (get_file_function_name): Support names for on-target constructor/destructor functions. libgomp/ * testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New test. * testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New test.
2023-03-27openmp: Add C/C++ support for loop transformations on inner loopsFrederik Harwath25-77/+793
Add the parsing of loop transformations on inner loops of a loop-nest. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (c_parser_omp_tile): ... adjust use here, (c_parser_omp_unroll): ... and here, (c_parser_omp_for_loop): ... and here. Stop treating loop transformations like intervening code, parse them, and adjust the loop-nest depth if necessary for tiling. gcc/cp/ChangeLog: * parser.cc (cp_parser_is_pragma): New function. (cp_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (cp_parser_omp_tile): ... adjust use here, (cp_parser_omp_unroll): ... and here, (cp_parser_omp_for_loop): ... and here. Stop treating loop gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-inner-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-inner-2.c: New test. libgomp/ChangeLog * testsuite/libgomp.c++/loop-transforms/tile-1.C: Deleted, replaced by matrix-* tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-1.h: New header file for new tests. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-constant-iter.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-helper.h: Likewise. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-no-directive-unroll-full-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-parallel-masked-taskloop-simd-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-target-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-taskloop-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-omp-teams-distribute-parallel-for-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-simd-1.c: New test. * testsuite/libgomp.c-c++-common/loop-transforms/matrix-transform-variants-1.h: New test. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-non-rect-1.c: New test.
2023-03-27openmp: Add Fortran support for loop transformations on inner loopsFrederik Harwath29-138/+1112
So far the implementation of the "omp tile" and "omp unroll" directives restricted their use to the outermost loop of a loop-nest. This commit changes the Fortran front end to parse and verify the directives on inner loops. The transformation clauses are extended to carry the information about the level of the loop-nest at which a transformation should be applied. The middle end transformation pass is adjusted to apply the transformations at the right level of a loop nest and to take their effect on the loop nest depth into account. gcc/fortran/ChangeLog: * openmp.cc (omp_unroll_removes_loop_nest): Move down in file. (resolve_loop_transform_generic): Remove, and ... (resolve_omp_unroll): ... inline and adapt here. Move function. Move functin. (find_nested_loop_in_block): New function. (find_nested_loop_in_chain): New function, used ... (is_outer_iteration_variable): ... here, and ... (expr_is_invariant): ... here. (resolve_omp_do): Adjust code for resolving loop transformations. (resolve_omp_tile): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_TRANSFROM_LEVEL on new clause. (compute_transformed_depth): New function to compute the depth ("collapse") of a transformed loop nest, used (gfc_trans_omp_do): ... here. gcc/ChangeLog: * omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type in comment. (gomp_for_uncollapse): Adjust "collapse" value after uncollapse. (partial_unroll): Add argument for the loop nest level to be transformed. (tile): Likewise. (transform_gomp_for): Pass level to transformatoin functions. (optimize_transformation_clauses): Handle transformation clauses for all levels recursively. * tree-pretty-print.cc (dump_omp_clause): Print OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access clause operand 0. (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0. (OMP_CLAUSE_TILE_SIZES): Likewise. gcc/cp/ChangeLog * parser.cc (cp_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (cp_parser_omp_clause_unroll_partial): Likewise. (cp_parser_omp_tile_sizes): Likewise. (cp_parser_omp_loop_transform_clause): Likewise. (cp_parser_omp_nested_loop_transform_clauses): Likewise. (cp_parser_omp_unroll): Likewise. * pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL and OMP_CLAUSE_TILE handling to changed number of operands. gcc/c/ChangeLog * c-parser.cc (c_parser_omp_clause_unroll_full): Set new OMP_CLAUSE_TRANSFORM_LEVEL operand to default value. (c_parser_omp_clause_unroll_partial): Likewise. (c_parser_omp_tile_sizes): Likewise. (c_parser_omp_loop_transform_clause): Likewise. (c_parser_omp_nested_loop_transform_clauses): Likewise. (c_parser_omp_unroll): Likewise. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust. * gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to changed diagnostic messages. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.
2023-03-27openmp: Add C/C++ support for "omp tile"Frederik Harwath23-102/+1823
This commit adds the C and C++ front end support for the "omp tile" directive. The middle end support for the transformation is implemented in a previous commit. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE. * c-pragma.cc (omp_pragmas_simd): Likewise. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_TILE gcc/c/ChangeLog: * c-parser.cc (c_parser_nested_omp_unroll_clauses): Rename and generalize ... (c_parser_omp_nested_loop_transform_clauses): ... to this. (c_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (c_parser_omp_tile_sizes): Parse single "sizes" clause. (c_parser_omp_loop_transform_clause): New function. (c_parser_omp_tile): New function for parsing "omp tile" (c_parser_omp_unroll): Adjust to renaming. (c_parser_omp_construct): Handle PRAGMA_OMP_TILE. gcc/cp/ChangeLog: * parser.cc (cp_parser_omp_clause_unroll_partial): Adjust. (cp_parser_nested_omp_unroll_clauses): Rename ... (cp_parser_omp_nested_loop_transform_clauses): ... to this. (cp_parser_omp_for_loop): Handle "omp tile" parsing in loop nests. (cp_parser_omp_tile_sizes): New function, parses single "sizes" clause (cp_parser_omp_tile): New function for parsing "omp tile". (cp_parser_omp_loop_transform_clause): New function. (cp_parser_omp_unroll): Adjust to renaming. (cp_parser_omp_construct): Handle PRAGMA_OMP_TILE. (cp_parser_pragma): Likewise. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_TILE. * semantics.cc (finish_omp_clauses): Likewise. gcc/ChangeLog: * gimplify.cc (omp_for_drop_tile_clauses): New function, ... (gimplify_omp_for): ... used here. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/tile-1.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-2.C: New test. * testsuite/libgomp.c++/loop-transforms/tile-3.C: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/tile-1.c: New test. * c-c++-common/gomp/loop-transforms/tile-2.c: New test. * c-c++-common/gomp/loop-transforms/tile-3.c: New test. * c-c++-common/gomp/loop-transforms/tile-4.c: New test. * c-c++-common/gomp/loop-transforms/tile-5.c: New test. * c-c++-common/gomp/loop-transforms/tile-6.c: New test. * c-c++-common/gomp/loop-transforms/tile-7.c: New test. * c-c++-common/gomp/loop-transforms/tile-8.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: Adapt to changed diagnostic messages. * g++.dg/gomp/loop-transforms/tile-1.h: New test. * g++.dg/gomp/loop-transforms/tile-1a.C: New test. * g++.dg/gomp/loop-transforms/tile-1b.C: New test.
2023-03-27openmp: Add Fortran support for "omp tile"Frederik Harwath33-120/+2038
This commit implements the Fortran front end support for the "omp tile" directive and the corresponding middle end transformation. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE. (enum gfc_exec_op): Add EXEC_OMP_TILE. (loop_transform_p): New declaration. (struct gfc_omp_clauses): Add "tile_sizes" field. * dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping. (show_omp_node): Handle EXEC_OMP_TILE. (show_code_node): Likewise. * match.h (gfc_match_omp_tile): New declaration. * openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field. (match_tile_sizes): New function. (OMP_TILE_CLAUSES): New macro. (gfc_match_omp_tile): New function. (resolve_omp_do): Handle EXEC_OMP_TILE. (resolve_omp_tile): New function. (omp_code_to_statement): Handle EXEC_OMP_TILE. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE and ST_OMP_TILE. (next_statement): Handle ST_OMP_TILE. (gfc_ascii_statement): Likewise. (parse_omp_do): Likewise. (parse_executable): Likewise. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field. (loop_transform_p): New function. (gfc_expr_list_len): New function. (gfc_trans_omp_do): Handle EXEC_OMP_TILE. (gfc_trans_omp_directive): Likewise. * trans.cc (trans_code): Likewise. gcc/ChangeLog: * gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_loop): Likewise. * omp-transform-loops.cc (walk_omp_for_loops): New declaration. (subst_var_in_op): New function. (subst_var): New function. (gomp_for_number_of_iterations): Adjust. (gomp_for_iter_count_type): New function. (gimple_assign_rhs_to_tree): New function. (subst_defs): New function. (gomp_for_uncollapse): Adjust. (transformation_clause_p): Add OMP_CLAUSE_TILE. (tile): New function. (transform_gomp_for): Handle OMP_CLAUSE_TILE. (optimize_transformation_clauses): Handle OMP_CLAUSE_TILE. * omp-general.cc (omp_loop_transform_clause_p): Add OMP_CLAUSE_TILE. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE. * tree.cc: Add OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_SIZES): New macro. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-2.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-3.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-4.f90: New test. * gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test.
2023-03-27openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILEFrederik Harwath16-47/+47
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation construct "omp tile". gcc/ChangeLog: * tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_LIST): Rename to ... (OMP_CLAUSE_OACC_TILE_LIST): ... this. (OMP_CLAUSE_TILE_ITERVAR): Rename to ... (OMP_CLAUSE_OACC_TILE_ITERVAR): ... this. (OMP_CLAUSE_TILE_COUNT): Rename to ... (OMP_CLAUSE_OACC_TILE_COUNT): this. * gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings. (gimplify_adjust_omp_clauses): Likewise. (gimplify_omp_for): Likewise. * omp-general.cc (omp_extract_for_data): Likewise. * omp-low.cc (scan_sharing_clauses): Likewise. (lower_oacc_head_mark): Likewise. * tree-nested.cc (convert_nonlocal_omp_clauses): Likewise. (convert_local_omp_clauses): Likewise. * tree-pretty-print.cc (dump_omp_clause): Likewise. * tree.cc: Likewise. gcc/c-family/ChangeLog: * c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings. (c_parser_oacc_clause_tile): Likewise. (c_parser_omp_for_loop): Likewise. * c-typeck.cc (c_finish_omp_clauses): Likewise. gcc/cp/ChangeLog: * parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings. (cp_parser_omp_clause_collapse): Likewise. (cp_parser_omp_for_loop): Likewise. * pt.cc (tsubst_omp_clauses): Likewise. * semantics.cc (finish_omp_clauses): Likewise. (finish_omp_for): Likewise. gcc/fortran/ChangeLog: * openmp.cc (enum omp_mask2): Adjust to renamings. (gfc_match_omp_clauses): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
2023-03-27openmp: Add C/C++ support for "omp unroll" directiveFrederik Harwath24-8/+1245
This commit implements the C and the C++ front end changes to support the "omp unroll" directive. The execution of the loop transformation relies on the pass that has been added as a part of the earlier Fortran patch. gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): Handle OMP_UNROLL. * c-omp.cc: Add "unroll" to omp_directives[]. * c-pragma.cc: Add "unroll" to omp_pragmas_simd[]. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_UNROLL to pragma_kind and adjust PRAGMA_OMP__LAST_. (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_clause_name): Handle "full" and "partial" clauses. (check_no_duplicate_clause): Change return type to bool and return check result. (c_parser_omp_clause_unroll_full): New function for parsing the "unroll clause". (c_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and PRAGMA_OMP_CLAUSE_PARTIAL. (c_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (OMP_UNROLL_CLAUSE_MASK): New definition. (c_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (c_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. * c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. gcc/cp/ChangeLog: * cp-gimplify.cc (cp_gimplify_expr): Handle OMP_UNROLL. (cp_fold_r): Likewise. (cp_genericize_r): Likewise. * parser.cc (cp_parser_omp_clause_name): Handle "full" clause. (check_no_duplicate_clause): Change return type to bool and return check result. (cp_parser_omp_clause_unroll_full): New function for parsing the "unroll clause". (cp_parser_omp_clause_unroll_partial): New function for parsing the "partial" clause. (cp_parser_omp_all_clauses): Handle OMP_CLAUSE_UNROLL and OMP_CLAUSE_FULL. (cp_parser_nested_omp_unroll_clauses): New function for parsing "omp unroll" directives following another directive. (cp_parser_omp_for_loop): Handle "omp unroll" directives between directive and loop. (OMP_UNROLL_CLAUSE_MASK): New definition. (cp_parser_omp_unroll): New function for parsing "omp unroll" loops that are not associated with another directive. (cp_parser_omp_construct): Handle PRAGMA_OMP_UNROLL. (cp_parser_pragma): Handle PRAGMA_OMP_UNROLL. * pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_NONE. (tsubst_expr): Handle OMP_UNROLL. * semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_UNROLL_NONE. libgomp/ChangeLog: * testsuite/libgomp.c++/loop-transforms/unroll-1.C: New test. * testsuite/libgomp.c++/loop-transforms/unroll-2.C: New test. * testsuite/libgomp.c-c++-common/loop-transforms/unroll-1.c: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/loop-transforms/unroll-1.c: New test. * c-c++-common/gomp/loop-transforms/unroll-2.c: New test. * c-c++-common/gomp/loop-transforms/unroll-3.c: New test. * c-c++-common/gomp/loop-transforms/unroll-4.c: New test. * c-c++-common/gomp/loop-transforms/unroll-5.c: New test. * c-c++-common/gomp/loop-transforms/unroll-6.c: New test. * g++.dg/gomp/loop-transforms/unroll-1.C: New test. * g++.dg/gomp/loop-transforms/unroll-2.C: New test. * g++.dg/gomp/loop-transforms/unroll-3.C: New test.
2023-03-27openmp: Add Fortran support for "omp unroll" directiveFrederik Harwath53-17/+3471
This commit implements the OpenMP 5.1 "omp unroll" directive for Fortran. The Fortran front end changes encompass the parsing and the verification of nesting restrictions etc. The actual loop transformation is implemented in a new language-independent "omp_transform_loops" pass which runs before omp lowering. No attempt is made to re-use existing unrolling optimizations because a separate implementation allows for better control of the unrolling. The new pass will also serve as a foundation for the implementation of further OpenMP loop transformations. This commit only implements the support for "omp unroll" on the outermost loop of a loop nest. The support for inner loops will be added later. gcc/ChangeLog: * Makefile.in: Add omp_transform_loops.o. * gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full" and "partial" clauses. * gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP. * gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL. (gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL, OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL. (gimplify_omp_for): Handle OMP_UNROLL. (gimplify_expr): Likewise. * params.opt: Add omp-unroll-full-max-iteration and omp-unroll-default-factor. * passes.def: Add pass_omp_transform_loop before pass_lower_omp. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. * tree-pass.h (make_pass_omp_transform_loops): Declare pmake_pass_omp_transform_loops. * tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and OMP_CLAUSE_UNROLL_PARTIAL. (dump_generic_node): Handle OMP_UNROLL. * tree.cc (omp_clause_num_ops): Add number of operators for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAl. (omp_clause_code_names): Add name strings for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and OMP_CLAUSE_UNROLL_PARTIAL. * tree.def (OMP_UNROLL): Define. * tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define. * omp-transform-loops.cc: New file. * omp-general.cc (omp_loop_transform_clause_p): New function. * omp-general.h (omp_loop_transform_clause_p): New declaration. gcc/fortran/ChangeLog: * dump-parse-tree.cc (show_omp_clauses): Handle "unroll full" and "unroll partial". (show_omp_node): Handle OMP_UNROLL. (show_code_node): Handle EXEC_OMP_UNROLL. * gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL. (enum gfc_exec_op): Add EXEC_OMP_UNROLL. * match.h (gfc_match_omp_unroll): Declare. * openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL. (gfc_match_omp_clauses): Handle "omp unroll partial". (OMP_UNROLL_CLAUSES): New macro definition. (gfc_match_omp_unroll): Match "full" clause. (omp_unroll_removes_loop_nest): New function. (resolve_omp_unroll): New function. (resolve_omp_do): Accept and verify "omp unroll" directives between directive and loop. (omp_code_to_statement): Handle EXEC_OMP_UNROLL. (gfc_resolve_omp_directive): Likewise. * parse.cc (decode_omp_directive): Handle "undroll" and "end unroll". (next_statement): Handle ST_OMP_UNROLL. (gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL. (parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL before/after loop. (parse_executable): Handle ST_OMP_UNROLL. * resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL. (gfc_resolve_code): Likewise. * st.cc (gfc_free_statement): Likewise. * trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses. (gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation. (gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL. * trans.cc (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test. * gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test.
2023-03-24Add 'libgomp.c/alloc-ompx_host_mem_alloc-1.c'Thomas Schwinge2-0/+79
OpenMP 'ompx_host_mem_alloc' is available for host and nvptx offloading as of og12 commit 84914e197d91a67b3d27db0e4c69a433462983a5 "openmp, nvptx: ompx_unified_shared_mem_alloc", and for GCN offloading as of og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd "amdgcn, libgomp: low-latency allocator". libgomp/ * testsuite/libgomp.c/alloc-ompx_host_mem_alloc-1.c: New.
2023-03-24Miscellaneous clean-up re OpenMP 'ompx_host_mem_space'Thomas Schwinge2-0/+7
Like done for nvptx in og12 commit 23f52e49368d7b26a1b1a72d6bb903d31666e961 "Miscellaneous clean-up re OpenMP 'ompx_unified_shared_mem_space', 'ompx_host_mem_space'". Clean-up for og12 commit c77c45a641fedc3fe770e909cc010fb1735bdbbd "amdgcn, libgomp: low-latency allocator". No functional change. libgomp/ * config/gcn/allocator.c (gcn_memspace_free): Explicitly handle 'memspace == ompx_host_mem_space'.
2023-03-24Add caveat/safeguard to OpenMP: Handle descriptors in target's firstprivate ↵Thomas Schwinge2-0/+12
[PR104949] Follow-up to commit 49d1a2f91325fa8cc011149e27e5093a988b3a49 "OpenMP: Handle descriptors in target's firstprivate [PR104949]". PR fortran/104949 libgomp/ * target.c (gomp_map_vars_internal) <GOMP_MAP_FIRSTPRIVATE>: Add caveat/safeguard. (cherry picked from commit e8fec6998b656dac02d4bc6c69b35a0fb5611e87)
2023-03-24libgomp: Simplify OpenMP reverse offload host <-> device memory copy ↵Thomas Schwinge7-103/+106
implementation ... by using the existing 'goacc_asyncqueue' instead of re-coding parts of it. Follow-up to commit 131d18e928a3ea1ab2d3bf61aa92d68a8a254609 "libgomp/nvptx: Prepare for reverse-offload callback handling", and commit ea4b23d9c82d9be3b982c3519fe5e8e9d833a6a8 "libgomp: Handle OpenMP's reverse offloads". libgomp/ * target.c (gomp_target_rev): Instead of 'dev_to_host_cpy', 'host_to_dev_cpy', 'token', take a single 'goacc_asyncqueue'. * libgomp.h (gomp_target_rev): Adjust. * libgomp-plugin.c (GOMP_PLUGIN_target_rev): Adjust. * libgomp-plugin.h (GOMP_PLUGIN_target_rev): Adjust. * plugin/plugin-gcn.c (process_reverse_offload): Adjust. * plugin/plugin-nvptx.c (rev_off_dev_to_host_cpy) (rev_off_host_to_dev_cpy): Remove. (GOMP_OFFLOAD_run): Adjust.
2023-03-24In 'libgomp/target.c:gomp_unmap_vars_internal', defer 'gomp_remove_var'Thomas Schwinge2-15/+22
An upcoming change requires that 'gomp_remove_var' be deferred until after all 'gomp_copy_dev2host' calls have been handled. Do this likewise to how commit 275c736e732d29934e4d22e8f030d5aae8c12a52 "libgomp: Structure element mapping for OpenMP 5.0" changed 'gomp_exit_data'. libgomp/ * target.c (gomp_unmap_vars_internal): Queue splay-tree keys for removal after main loop.
2023-03-24Given OpenACC 'async', defer 'free' of non-contiguous array support data ↵Thomas Schwinge3-2/+16
structures [PR76739] Fix-up for og12 commit 15d0f61a7fecdc8fd12857c40879ea3730f6d99f "Merge non-contiguous array support patches". PR other/76739 libgomp/ * oacc-parallel.c (GOACC_parallel_keyed): Given OpenACC 'async', defer 'free' of non-contiguous array support data structures. * target.c (gomp_map_vars_internal): Likewise.
2023-03-23Fortran/OpenMP: Fix 'alloc' and 'from' mapping for allocatable componentsTobias Burnus4-7/+371
Even with 'alloc' and map-entering 'from' mapping, the following should hold. For explicit mapping, that's already the case, this handles the automatical deep mapping of allocatable components. Namely: * On the device, the array bounds (of allocated allocatables) must match the host, implying 'to' (or 'tofrom') mapping. * On map exiting, the copying out shall not destroy the unallocated allocation status (nor the pointer address of allocated allocatables). The latter was not a problem for allocated allocatables as for those a pointer was GOMP_MAP_ATTACHed; however, for unallocated allocatables, before it copied back device-allocated memory which might not be nullified. While 'alloc' was not deep-mapped at all, for map-entering 'from', the array bounds were not set, making allocated derived-type components inaccessible on the device (and wrong on the host on copy back). The solution is, first, to deep-map 'alloc' as well and to copy to the device even with 'alloc' and (map-entering) 'from'. This copying is only done if there is a scalar (for the unallocated case) or array allocatable directly in the derived type and then it is shallowly copied; the data pointed to is then again only alloc'ed, unless it contains in turn allocatables. gcc/fortran/ * trans-openmp.cc (gfc_has_alloc_comps): Add 'bool shallow_alloc_only=false' arg. (gfc_omp_replace_alloc_by_to_mapping): New, call it. (gfc_omp_deep_map_kind_p): Return 'true' also for '(present,)alloc'. (gfc_omp_deep_mapping_item, gfc_omp_deep_mapping_do): On map entering, replace shallowly 'alloc'/'from' by '(from)to' mapping if there are allocatable components. libgomp/ * testsuite/libgomp.fortran/map-alloc-comp-8.f90: New test.
2023-03-23Fortran: Add attr.class_ok check for generate_callback_wrapperTobias Burnus3-2/+11
Proper variables/components of type BT_CLASS have 'class_ok' set; check for that to avoid an ICE on invalid code for gfortran.dg/pr108434.f90. gcc/fortran/ * class.cc (generate_callback_wrapper): Add attr.class_ok check. * resolve.cc (resolve_fl_derived): Likewise.
2023-03-23OpenMP/Fortran: Fix unmapping of GOMP_MAP_POINTER for scalar ↵Tobias Burnus4-13/+111
allocatables/pointers target exit data: Do unmap GOMP_MAP_POINTER for scalar allocatables/pointers to prevent stale mappings. While for allocatable/pointer arrays, there is a PSET followed by POINTER, for allocatable/pointer scalars there is only a POINTER. Before the below mentioned OG12 patch: For exit data, PSET was converted to RELEASE/DELETE in gimplify.cc while all POINTER were removed; correct for arrays but leaving POINTER behind for scalars. Since that commit, all in trans-openmp.cc but the scalar case was still mishandled before this follow-up commit. This is a follow up to OG12's 55a18d4744258e3909568e425f9f473c49f9d13f While the problem is independent, it will be merged into v4 of the mainline patch 'Fortran/OpenMP: Fix mapping of array descriptors and deferred-length strings' gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Fix unmapping of GOMP_MAP_POINTER for scalar allocatables/pointers. gcc/testsuite/ * gfortran.dg/gomp/map-10.f90: New test.
2023-03-22amdgcn: Add instruction patterns for complex number operations.Andrew Jenner7-0/+995
gcc/ChangeLog: * config/gcn/gcn-protos.h (gcn_expand_dpp_swap_pairs_insn) (gcn_expand_dpp_distribute_even_insn) (gcn_expand_dpp_distribute_odd_insn): Declare. * config/gcn/gcn-valu.md (@dpp_swap_pairs<mode>) (@dpp_distribute_even<mode>, @dpp_distribute_odd<mode>) (cmul<conj_op><mode>3, cml<addsub_as><mode>4, vec_addsub<mode>3) (cadd<rot><mode>3, vec_fmaddsub<mode>4, vec_fmsubadd<mode>4) (fms<mode>4<exec>, fms<mode>4_negop2<exec>, fms<mode>4) (fms<mode>4_negop2): New patterns. * config/gcn/gcn.cc (gcn_expand_dpp_swap_pairs_insn) (gcn_expand_dpp_distribute_even_insn) (gcn_expand_dpp_distribute_odd_insn): New functions. * config/gcn/gcn.md: Add entries to unspec enum. gcc/testsuite/ChangeLog: * gcc.target/gcn/complex.c: New test.
2023-03-17amdgcn: Fix register size bugAndrew Stubbs2-0/+18
Fix an issue in which "vectors" of duplicate entries placed in scalar registers caused the following 63 registers to be marked live, for the purpose of prologue generation, which resulted in stack corruption. gcc/ChangeLog: * config/gcn/gcn.cc (gcn_class_max_nregs): Handle vectors in SGPRs. (move_callee_saved_registers): Detect the bug condition early.
2023-03-17vect: Fix missed gather load opportunityRichard Sandiford4-0/+56
While writing a testcase for PR106794, I noticed that we failed to vectorise the testcase in the patch for SVE. The code that recognises gather loads tries to optimise the point at which the offset is calculated, to avoid unnecessary extensions or truncations: /* Don't include the conversion if the target is happy with the current offset type. */ But breaking only makes sense if we're at an SSA_NAME (which could then be vectorised). We shouldn't break on a conversion embedded in a generic expression. gcc/ * tree-vect-data-refs.cc (vect_check_gather_scatter): Restrict early-out optimisation to SSA_NAMEs. gcc/testsuite/ * gcc.dg/vect/vect-gather-5.c: New test. (cherry picked from commit 4a773bf2f08656a39ac75cf6b4871c8cec8b5007)
2023-03-17amdgcn: gather/scatter with DImode offsetsAndrew Stubbs2-0/+130
The GPU architecture requires SImode offsets on gather/scatter instructions, but they can also take a vector of absolute addresses, so this allows gather/scatter in more situations. gcc/ChangeLog: * config/gcn/gcn-valu.md (gather_load<mode><vndi>): New. (scatter_store<mode><vndi>): New. (mask_gather_load<mode><vndi>): New. (mask_scatter_store<mode><vndi>): New.
2023-03-17amdgcn: vec_extract no-op insnsAndrew Stubbs5-9/+89
Just using move insn for no-op conversions triggers special move handling in IRA which declares that subreg of vectors aren't valid and routes everything through memory. These patterns make the vec_select explicit and all is well. gcc/ChangeLog: * config/gcn/gcn-protos.h (gcn_stepped_zero_int_parallel_p): New. * config/gcn/gcn-valu.md (V_1REG_ALT): New. (V_2REG_ALT): New. (vec_extract<V_1REG:mode><V_1REG_ALT:mode>_nop): New. (vec_extract<V_2REG:mode><V_2REG_ALT:mode>_nop): New. (vec_extract<V_ALL:mode><V_ALL_ALT:mode>): Use new patterns. * config/gcn/gcn.cc (gcn_stepped_zero_int_parallel_p): New. * config/gcn/predicates.md (ascending_zero_int_parallel): New.
2023-03-10Use 'GOMP_MAP_VARS_TARGET' for OpenACC compute constructs [PR90596]Thomas Schwinge6-238/+62
Thereby considerably simplify the device plugins' 'GOMP_OFFLOAD_openacc_exec', 'GOMP_OFFLOAD_openacc_async_exec' functions: in terms of lines of code, but in particular conceptually: no more device memory allocation, host to device data copying, device memory deallocation -- 'GOMP_MAP_VARS_TARGET' does all that for us. This depends on commit 2b2340e236c0bba8aaca358ea25a5accd8249fbd "Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data", where I said that "a use will emerge later", which is this one here. PR libgomp/90596 libgomp/ * target.c (gomp_map_vars_internal): Allow for 'param_kind == GOMP_MAP_VARS_OPENACC | GOMP_MAP_VARS_TARGET'. * oacc-parallel.c (GOACC_parallel_keyed): Pass 'GOMP_MAP_VARS_TARGET' to 'goacc_map_vars'. * plugin/plugin-gcn.c (alloc_by_agent, gcn_exec) (GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify. (gomp_offload_free): Remove. * plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Adjust, simplify. (cuda_free_argmem): Remove. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Adjust. (cherry picked from commit f8332e52a498df480f72303de32ad0751ad899fe)
2023-03-10Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' dataThomas Schwinge2-34/+43
This does *allow*, but under no circumstances is this currently going to be used: all potentially applicable data is non-'ephemeral', and thus not considered for 'gomp_coalesce_buf_add' for OpenACC 'async'. (But a use will emerge later.) Follow-up to commit r12-2530-gd88a6951586c7229b25708f4486eaaf4bf4b5bbe "Don't use libgomp 'cbuf' buffering with OpenACC 'async'", addressing this TODO comment: TODO ... but we could allow CBUF usage for EPHEMERAL data? (Open question: is it more performant to use libgomp CBUF buffering or individual device asyncronous copying?) Ephemeral data is small, and therefore individual device asyncronous copying does seem dubious -- in particular given that for all those, we'd individually have to allocate and queue for deallocation a temporary buffer to capture the ephemeral data. Instead, just let the 'cbuf' *be* the temporary buffer. libgomp/ * target.c (gomp_copy_host2dev, gomp_map_vars_internal): Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data. (cherry picked from commit 2b2340e236c0bba8aaca358ea25a5accd8249fbd)
2023-03-10Simplify OpenACC 'no_create' clause implementationThomas Schwinge6-27/+67
For 'OFFSET_INLINED', 'gomp_map_val' does the right thing, and we may then simplify the device plugins accordingly. This is a follow-up to Subversion r279551 (Git commit a6163563f2ce502bd4ef444bd5de33570bb8eeb1) "Add OpenACC 2.6's no_create", Subversion r279622 (Git commit 5bcd470bf0749e1f56d05dd43aa9584ff2e3a090) "Use gomp_map_val for OpenACC host-to-device address translation". libgomp/ * target.c (gomp_map_vars_internal): Use 'OFFSET_INLINED' for 'GOMP_MAP_IF_PRESENT'. * plugin/plugin-gcn.c (gcn_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Adjust. * plugin/plugin-nvptx.c (nvptx_exec, GOMP_OFFLOAD_openacc_exec) (GOMP_OFFLOAD_openacc_async_exec): Likewise. * testsuite/libgomp.oacc-c-c++-common/no_create-1.c: Add 'async' testing. * testsuite/libgomp.oacc-c-c++-common/no_create-2.c: Likewise. (cherry picked from commit 199867d07be65cb0227a318ebf42b8376ca09313)
2023-03-10OpenACC: Remove 'acc_async_test' -> skip shortcut in ↵Thomas Schwinge2-3/+6
'libgomp/oacc-async.c:goacc_wait' We're not taking such a shortcut anywhere else, and (with future changes) it has potential to confuse things if synchronization in a libgomp plugin happens to have side effects even if an async queue currently is empty. libgomp/ * oacc-async.c (goacc_wait): Remove 'acc_async_test' -> skip shortcut. (cherry picked from commit b5037d4a073f2e4625afab5ec1f35624d9f9eba1)
2023-03-10Document/verify another aspect of OpenACC 'async' semantics in ↵Thomas Schwinge2-2/+8
'libgomp.oacc-c-c++-common/data-3.c' ... that I almost broke with later implementation changes. libgomp/ * testsuite/libgomp.oacc-c-c++-common/data-3.c: Document/verify another aspect of OpenACC 'async' semantics. (cherry picked from commit 442d51a20ef13a8e6c080ca30bc37fc93b6bfac4)
2023-03-10Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' positionThomas Schwinge3-30/+203
For an OpenACC compute construct, we've currently got: - [...] - acc_ev_enqueue_launch_start - launch kernel - free memory - acc_ev_free - acc_ev_enqueue_launch_end This confused another thing that I'm working on, so I adjusted that to: - [...] - acc_ev_enqueue_launch_start - launch kernel - acc_ev_enqueue_launch_end - free memory - acc_ev_free Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'. libgomp/ * plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end' position. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Verify 'acc_ev_alloc', 'acc_ev_free'. (cherry picked from commit 649f1939baf11f45fd3579b8b9601c7840a097b3)