Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
The gfx803 "Fiji" device was deprecated in GCC 14, removed from LLVM 18, and
hasn't worked properly with the drivers since about ROCm 4.
This patch removes the device from GCC options and documentation, and removes
the direct mentions from the internals.
The TARGET_GCN3 support in the back-end is now unused and can be removed (in a
follow-up patch).
gcc/ChangeLog:
* config.gcc (amdgcn-*-*): Remove "fiji" from with_arch checks.
* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove fiji alternative.
(NO_XNACK): Likewise.
(NO_SRAM_ECC): Likewise.
(ASM_SPEC): Remove "%{}" around ABI_VERSION_SPEC.
* config/gcn/gcn-opts.h (enum processor_type): Remove PROCESSOR_FIJI.
(TARGET_FIJI): Delete.
* config/gcn/gcn.cc (gcn_option_override): Remove Fiji.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Likewise.
* config/gcn/gcn.opt (gpu_type): Likewise.
(march, mtune): Change default to PROCESSOR_VEGA10.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX803): Delete.
(copy_early_debug_info): Remove elf_flags_actual.
Use ELFABIVERSION_AMDGPU_HSA_V4 unconditionally.
(get_arch): Remove Fiji.
(main): Remove gfx803.
* config/gcn/t-omp-device
(omp-device-properties-gcn): Remove fiji and gfx803.
* doc/install.texi (amdgcn*-*-*): Remove fiji and special instructions.
* doc/invoke.texi: Remove fiji.
libgomp/ChangeLog:
* libgomp.texi: Remove fiji and gfx803.
* testsuite/libgomp.c/declare-variant-4.h: Remove fiji and gfx803.
* testsuite/libgomp.c/declare-variant-4-fiji.c: Removed.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: Removed.
|
|
Since r15-3254-g3f51f0dc88ec21c1ec79df694200f10ef85915f4
added scan-ltrans-rtl* variants to scanltranstree.exp, it no longer
makes sense to have "tree" in the name. This renames the file
accordingly and updates users.
libatomic/ChangeLog:
* testsuite/lib/libatomic.exp: Load scanltrans.exp instead of
scanltranstree.exp.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp: Load scanltrans.exp instead of
scanltranstree.exp.
libitm/ChangeLog:
* testsuite/lib/libitm.exp: Load scanltrans.exp instead of
scanltranstree.exp.
libphobos/ChangeLog:
* testsuite/lib/libphobos-dg.exp: Load scanltrans.exp instead of
scanltranstree.exp.
libvtv/ChangeLog:
* testsuite/lib/libvtv.exp: Load scanltrans.exp instead of
scanltranstree.exp.
gcc/testsuite/ChangeLog:
* gcc.dg-selftests/dg-final.exp: Load scanltrans.exp instead of
scanltranstree.exp.
* lib/gcc-dg.exp: Likewise.
* lib/scanltranstree.exp: Rename to ...
* lib/scanltrans.exp: ... this.
|
|
|
|
This commit adds OpenMP 5.1+'s interop enumeration, type and routine
declarations to the C/C++ header file and, new in OpenMP TR13, also to
the Fortran module and omp_lib.h header file.
While a stub implementation is provided, only with foreign runtime
support by the libgomp GPU plugins and with the 'interop' directive,
this becomes really useful.
libgomp/ChangeLog:
* fortran.c (omp_get_interop_str_, omp_get_interop_name_,
omp_get_interop_type_desc_, omp_get_interop_rc_desc_): Add.
* libgomp.map (GOMP_5.1.3): New; add interop routines.
* omp.h.in: Add interop typedefs, enum and prototypes.
(__GOMP_DEFAULT_NULL): Define.
(omp_target_memcpy_async, omp_target_memcpy_rect_async):
Use it for the optional depend argument.
* omp_lib.f90.in: Add paramters and interfaces for interop.
* omp_lib.h.in: Likewise; move F90 '&' to column 81 for
-ffree-length-80.
* target.c (omp_get_num_interop_properties, omp_get_interop_int,
omp_get_interop_ptr, omp_get_interop_str, omp_get_interop_name,
omp_get_interop_type_desc, omp_get_interop_rc_desc): Add.
* config/gcn/target.c (omp_get_num_interop_properties,
omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
omp_get_interop_name, omp_get_interop_type_desc,
omp_get_interop_rc_desc): Add.
* config/nvptx/target.c (omp_get_num_interop_properties,
omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
omp_get_interop_name, omp_get_interop_type_desc,
omp_get_interop_rc_desc): Add.
* testsuite/libgomp.c-c++-common/interop-routines-1.c: New test.
* testsuite/libgomp.c-c++-common/interop-routines-2.c: New test.
* testsuite/libgomp.fortran/interop-routines-1.F90: New test.
* testsuite/libgomp.fortran/interop-routines-2.F90: New test.
* testsuite/libgomp.fortran/interop-routines-3.F: New test.
* testsuite/libgomp.fortran/interop-routines-4.F: New test.
* testsuite/libgomp.fortran/interop-routines-5.F: New test.
* testsuite/libgomp.fortran/interop-routines-6.F: New test.
* testsuite/libgomp.fortran/interop-routines-7.F90: New test.
|
|
|
|
Fix effective-target keyword in test cases
(Most of) the tests added in commit f1bfba3a9b3f31e3e06bfd1911c9f223869ea03f
"OpenMP: Constructors and destructors for "declare target" static aggregates"
had a mismatch between dump file production and its scanning; the former needs
to use 'offload_target_nvptx' (like 'offload_target_amdgcn'), not
'offload_device_nvptx'.
libgomp/
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C:
Fix effective-target keyword.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
Likewise.
* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c:
Likewise.
* testsuite/libgomp.c-c++-common/target-is-initial-host.c:
Likewise.
* testsuite/libgomp.fortran/target-is-initial-host-2.f90:
Likewise.
* testsuite/libgomp.fortran/target-is-initial-host.f: Likewise.
* testsuite/libgomp.fortran/target-is-initial-host.f90: Likewise.
|
|
|
|
libgomp/ChangeLog:
* libgomp.texi (OpenMP Technical Report 13): Renamed from
'OpenMP Technical Report 12'; updated for TR13 changes.
|
|
libgomp/ChangeLog:
* libgomp.texi (omp_is_initial_device): Mention
-fno-builtin-omp_is_initial_device and folding by default.
|
|
In principle, the optimized dump should be the same on the host, but as
'nohost' is not handled, is is present. However when ENABLE_OFFLOADING is
false, it is handled early enough to remove the function.
libgomp/ChangeLog:
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: Split
scan-tree-dump into with and without target offload_target_any.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C:
Likewise.
|
|
|
|
This commit also compile-time expands (__builtin_)omp_is_initial_device for
both Fortran and C/C++ (unless, -fno-builtin-omp_is_initial_device is used).
But the main change is:
This commit adds support for running constructors and destructors for
static (file-scope) aggregates for C++ objects which are marked with
"declare target" directives on OpenMP offload targets.
Before this commit, space is allocated on the target for such aggregates,
but nothing ever constructs them properly, so they end up zero-initialised.
(See the new test static-aggr-constructor-destructor-3.C for a reason
why running constructors on the target is preferable to e.g. constructing
on the host and then copying the resulting object to the target.)
2024-08-07 Julian Brown <julian@codesourcery.com>
Tobias Burnus <tobias@baylibre.com>
gcc/ChangeLog:
* builtins.def (DEF_GOMP_BUILTIN_COMPILER): Define
DEF_GOMP_BUILTIN_COMPILER to handle the non-prefix version.
* gimple-fold.cc (gimple_fold_builtin_omp_is_initial_device): New.
(gimple_fold_builtin): Call it.
* omp-builtins.def (BUILT_IN_OMP_IS_INITIAL_DEVICE): Define.
* tree.cc (get_file_function_name): Support names for on-target
constructor/destructor functions.
gcc/cp/
* decl2.cc (tree-inline.h): Include.
(static_init_fini_fns): Bump to four entries. Update comment.
(start_objects, start_partial_init_fini_fn): Add 'omp_target'
parameter. Support "declare target" decls. Update forward declaration.
(emit_partial_init_fini_fn): Add 'host_fn' parameter. Return tree for
the created function. Support "declare target".
(OMP_SSDF_IDENTIFIER): New macro.
(partition_vars_for_init_fini): Support partitioning "declare target"
variables also.
(generate_ctor_or_dtor_function): Add 'omp_target' parameter. Support
"declare target" decls.
(c_parse_final_cleanups): Support constructors/destructors on OpenMP
offload targets.
gcc/fortran/ChangeLog:
* gfortran.h (gfc_option_t): Add disable_omp_is_initial_device.
* lang.opt (fbuiltin-): Add.
* options.cc (gfc_handle_option): Handle
-fno-builtin-omp_is_initial_device.
* f95-lang.cc (gfc_init_builtin_functions): Handle
DEF_GOMP_BUILTIN_COMPILER.
* trans-decl.cc (gfc_get_extern_function_decl): Add code to use
DEF_GOMP_BUILTIN_COMPILER for 'omp_is_initial_device'.
libgomp/ChangeLog:
* testsuite/libgomp.c++/static-aggr-constructor-destructor-1.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-2.C: New test.
* testsuite/libgomp.c++/static-aggr-constructor-destructor-3.C: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host.c: New test.
* testsuite/libgomp.c-c++-common/target-is-initial-host-2.c: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f: New test.
* testsuite/libgomp.fortran/target-is-initial-host.f90: New test.
* testsuite/libgomp.fortran/target-is-initial-host-2.f90: New test.
Co-authored-by: Tobias Burnus <tobias@baylibre.com>
|
|
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/target-link-2.c: Reset variable
value to handle multi-device tests.
|
|
|
|
The run time library loads the offload functions and variable and optionally
the ICV variable and returns the number of loaded items, which has to match
the host side. The plugin returns "+1" (since GCC 12) for the ICV variable
entry, independently whether it was loaded or not, but the var's value
(start == end == 0) can be used to detect when this failed.
Thus, we can tighten the assert check - which this commit does together with
making the output less surprising - and simplify the condition further below.
libgomp/ChangeLog:
* target.c (gomp_load_image_to_device): Extend fatal-error message;
simplify a condition.
|
|
|
|
To keep track of missing routine documentation (both implemented and not),
the libgomp.texi file contains all non-OMPT routines as commented items
in @menu. This commit adds the routines added in TR13 as commented fixme
items.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Runtime Library Routines): Add TR13 routines
to @menu (commented out).
|
|
|
|
As the PR and included testcase shows, replacing 'arr2' by its value expression
'*arr2$13$linkptr' failed for
MEM <uint128_t> [(c_char * {ref-all})&arr2]
which left 'arr2' in the code as unknown symbol. Now expand the value expression
already in pass_omp_target_link::execute's process_link_var_op walk_gimple_stmt
walk - and don't rely on gimple_regimplify_operands.
PR middle-end/115637
gcc/ChangeLog:
* gimplify.cc (gimplify_body): Fix macro name in the comment.
* omp-offload.cc (find_link_var_op): Rename to ...
(process_link_var_op): ... this. Replace value expr.
(pass_omp_target_link::execute): Update walk_gimple_stmt call.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/declare-target-link.f90: Uncomment
now working code.
Co-authored-by: Richard Biener <rguenther@suse.de
|
|
|
|
Per gccint, 'dg-require-effective-target' must come before any
'dg-additional-sources' directives. Fix a handful of deviant cases.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/aapcs64/func-ret-3.c: Fix dg-require-effective-target directive order.
* gcc.target/aarch64/aapcs64/func-ret-4.c: Likewise.
* gfortran.dg/PR100914.f90: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.c++/pr24455.C: Fix dg-require-effective-target directive order.
* testsuite/libgomp.c/pr24455.c: Likewise.
|
|
'dg-run' is not a valid dejagnu directive, 'dg-do run' is needed here
for the test to be executed.
That said, it actually seems to be executed for me anyway, presumably
a default in the directory, but let's fix it to be consistent with
other uses in the tree and in that test directory even.
libgomp/ChangeLog:
* testsuite/libgomp.c++/declare-target-indirect-1.C: Fix 'dg-run' typo.
|
|
|
|
Update 'OpenMP Runtime Library Routines' by adding a note that invoking
inside a target region might invoke unspecified behavior. Additionally,
update omp_{get,set}_default_device for omp_{initial,invalid}_device
named constants.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Runtime Library Routines): Add missing
title to some commented still undocumented items.
(Device Information Routines): Update.
|
|
Contrary to a normal 'declare target', the 'declare target link' attribute
also needs to set node->offloadable and push the offload_vars in the front end.
Linked variables require that the data is mapped. For module variables, this
can happen anywhere. For variables in an external subprograms or the main
programm, this can only happen in the either that program itself or in an
internal subprogram. - Whether a variable is just normally mapped or linked then
becomes relevant if a device routine exists that can access that variable,
i.e. an internal procedure has then to be marked as declare target.
PR fortran/115559
gcc/fortran/ChangeLog:
* trans-common.cc (build_common_decl): Add 'omp declare target' and
'omp declare target link' variables to offload_vars.
* trans-decl.cc (add_attributes_to_decl): Likewise; update args and
call decl_attributes.
(get_proc_pointer_decl, gfc_get_extern_function_decl,
build_function_decl): Update calls.
(gfc_get_symbol_decl): Likewise; move after 'DECL_STATIC (t)=1'
to avoid errors with symtab_node::get_create.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/declare-target-link.f90: New test.
|
|
Assume that 'int var[100]' is 'omp declare target link(var)'. When now
mapping an array section with offset such as 'map(to:var[20:10])',
the device-side link pointer has to store &<device-storage-data>[0] minus
the offset such that var[20] will access <device-storage-data>[0]. But
the offset calculation was missed such that the device-side 'var' pointed
to the first element of the mapped data - and var[20] points beyond at
some invalid memory.
PR middle-end/116107
libgomp/ChangeLog:
* target.c (gomp_map_vars_internal): Honor array mapping offsets
with declare-target 'link' variables.
* testsuite/libgomp.c-c++-common/target-link-2.c: New test.
|
|
|
|
For reference:
- <https://inbox.sourceware.org/20211111190313.GV2710@tucnak> "[PATCH] openmp: Honor OpenMP 5.1 num_teams lower bound"
- <https://inbox.sourceware.org/20211112132023.GC2710@tucnak> "[PATCH] libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound"
libgomp/
* config/gcn/target.c (GOMP_teams4): Document.
* config/nvptx/target.c (GOMP_teams4): Likewise.
* target.c (GOMP_teams4): Likewise.
|
|
Corresponding to commit 9fa72756d90e0d9edadf6e6f5f56476029925788
"libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound", these are the
GCN offloading changes to fix:
PASS: libgomp.c/../libgomp.c-c++-common/teams-2.c (test for excess errors)
[-FAIL:-]{+PASS:+} libgomp.c/../libgomp.c-c++-common/teams-2.c execution test
PASS: libgomp.c++/../libgomp.c-c++-common/teams-2.c (test for excess errors)
[-FAIL:-]{+PASS:+} libgomp.c++/../libgomp.c-c++-common/teams-2.c execution test
..., and omptests' 't-critical' test case. I've cross checked that those test
cases are the ones that regress for nvptx offloading, if I locally revert the
"libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound" changes.
libgomp/
* config/gcn/libgomp-gcn.h (GOMP_TEAM_NUM): Inject.
* config/gcn/target.c (GOMP_teams4): Handle.
* config/gcn/team.c (gomp_gcn_enter_kernel): Initialize.
* config/gcn/teams.c (omp_get_team_num): Adjust.
|
|
2024-07-19 Paul Thomas <pault@gcc.gnu.org>
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Cut
dg-note about 'a' and remove bogus warnings about its array
descriptor components being used uninitialized.
|
|
|
|
This creates a new predefined allocator as a shortcut for using pinned
memory with OpenMP. This is not in the OpenMP standard so it uses the "ompx"
namespace and an independent enum baseline of 200 (selected to not clash with
other known implementations).
The allocator is equivalent to using a custom allocator with the pinned
trait and the null fallback trait. One motivation for having this feature is
for use by the (planned) -foffload-memory=pinned feature.
gcc/fortran/ChangeLog:
* openmp.cc (is_predefined_allocator): Update valid ranges to
incorporate ompx_gnu_pinned_mem_alloc.
libgomp/ChangeLog:
* allocator.c (ompx_gnu_min_predefined_alloc): New.
(ompx_gnu_max_predefined_alloc): New.
(predefined_alloc_mapping): Rename to ...
(predefined_omp_alloc_mapping): ... this.
(predefined_ompx_gnu_alloc_mapping): New.
(_Static_assert): Adjust for the new name, and add a new assert for the
new table.
(predefined_allocator_p): New.
(predefined_alloc_mapping): New.
(omp_aligned_alloc): Support ompx_gnu_pinned_mem_alloc.
Use predefined_allocator_p and predefined_alloc_mapping.
(omp_free): Likewise.
(omp_alligned_calloc): Likewise.
(omp_realloc): Likewise.
* env.c (parse_allocator): Add ompx_gnu_pinned_mem_alloc.
* libgomp.texi: Document ompx_gnu_pinned_mem_alloc.
* omp.h.in (omp_allocator_handle_t): Add ompx_gnu_pinned_mem_alloc.
* omp_lib.f90.in: Add ompx_gnu_pinned_mem_alloc.
* omp_lib.h.in: Add ompx_gnu_pinned_mem_alloc.
* testsuite/libgomp.c/alloc-pinned-5.c: New test.
* testsuite/libgomp.c/alloc-pinned-6.c: New test.
* testsuite/libgomp.fortran/alloc-pinned-1.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-pinned-1.f90: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
|
|
The feature doesn't work on non-Linux hosts, at present, so skip the tests
entirely.
On Linux systems that have insufficient lockable memory configured we still
need to fail or else the feature won't be getting tested when we think it is,
but now there's a message to explain why.
libgomp/ChangeLog:
* testsuite/libgomp.c/alloc-pinned-1.c: Change dg-xfail-run-if to
dg-skip-if.
Correct spelling mistake.
Abort on insufficient lockable memory.
Use #error on non-linux hosts.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
|
|
|
|
libgomp/
* libgomp.texi (nvptx): Add missing preposition.
|
|
..., in order to enable (portions of) Fortran I/O, for example.
libgfortran/
* configure.ac: No longer set 'LIBGFOR_MINIMAL' for nvptx.
* configure: Regenerate.
libgomp/
* libgomp.texi (nvptx): Update.
* testsuite/libgomp.fortran/target-print-1-nvptx.f90: Remove.
* testsuite/libgomp.fortran/target-print-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/error_stop-2-nvptx.f: New.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Adjust.
* testsuite/libgomp.oacc-fortran/print-1-nvptx.f90: Adjust.
* testsuite/libgomp.oacc-fortran/print-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/stop-2-nvptx.f: New.
* testsuite/libgomp.oacc-fortran/stop-2.f: Adjust.
Co-authored-by: Andrew Stubbs <ams@gcc.gnu.org>
|
|
variable [PR97384, PR105274]
... as a means to manually set the "native" GPU thread stack size.
PR libgomp/97384
PR libgomp/105274
libgomp/
* plugin/cuda-lib.def (cuCtxSetLimit): Add.
* plugin/plugin-nvptx.c (nvptx_open_device): Handle
'GOMP_NVPTX_NATIVE_GPU_THREAD_STACK_SIZE' environment variable.
|
|
This extends commit d9c90c82d900fdae95df4499bf5f0a4ecb903b53
"nvptx target: Global constructor, destructor support, via nvptx-tools 'ld'"
for offloading.
libgcc/
* config/nvptx/gbl-ctors.c ["mgomp"]
(__do_global_ctors__entry__mgomp)
(__do_global_dtors__entry__mgomp): New.
[!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry):
New.
libgomp/
* plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New.
(nvptx_close_device, GOMP_OFFLOAD_load_image)
(GOMP_OFFLOAD_unload_image): Call it.
|
|
'abort' [GCC PR85463]"
PR target/85463
libgfortran/
* runtime/minimal.c [__nvptx__] (exit): Don't override.
libgomp/
* config/nvptx/error.c (exit): Don't override.
* testsuite/libgomp.oacc-fortran/error_stop-1.f: Update.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.
|
|
implementation status
The implementation has been committed in r15-1037.
2024-06-06 Jakub Jelinek <jakub@redhat.com>
* libgomp.texi (OpenMP 5.1 status): Mark Loop transformation constructs
as implemented.
|
|
|
|
This patch is largely rewritten version of the
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631764.html
patch set which I've promissed to adjust the way I'd like it but didn't
get to it until now.
The previous series together in diffstat was
176 files changed, 12107 insertions(+), 298 deletions(-)
This patch is
197 files changed, 10843 insertions(+), 212 deletions(-)
and diff between the old series and new patch is
268 files changed, 8053 insertions(+), 9231 deletions(-)
Only the 5.1/5.2 tile/unroll constructs are supported, in various
places some preparations for the other 6.0 loop transformations
constructs (interchange/reverse/fuse) are done, but certainly
not complete and not everywhere. The important difference is that
because tile/unroll partial map 1:1 the original loops to generated
canonical loops and add another set of generated loops without canonical
form inside of it, the tile/unroll partial constructs are terminal
for the generated loop, one can't have some loops from the tile or
unroll partial and some further loops from inside the body of that
construct.
The GENERIC representation attempts to match what the standard specifies,
so there are separate OMP_TILE and OMP_UNROLL trees. If for a particular
loop in a loop nest of some OpenMP loop it awaits a generated loop from a
nested loop, or if in OMP_LOOPXFORM_LOWERED OMP_TILE/UNROLL construct
a generated loop has been moved to some surrounding construct, that
particular loop is represented by all NULL_TREEs in the
OMP_FOR_{INIT,COND,INCR,ORIG_DECLS} vector.
The lowering of the loop transforming constructs is done at gimplification
time, at the start of gimplify_omp_for.
I think this way it is more maintainable over magic clauses with various
loop depths on the other looping constructs or the magic OMP_LOOP_TRANS
construct.
Though, I admit I'm still undecided how to represent the OpenMP 6.0
loop transformation case of say:
#pragma omp for collapse (4)
for (int i = 0; i < 32; ++i)
#pragma omp interchange permutation (2, 1)
#pragma omp reverse
for (int j = 0; j < 32; ++j)
#pragma omp reverse
for (int k = 0; k < 32; ++k)
for (int l = 0; l < 32; ++l)
;
Surely the i loop would go to first vector elements of OMP_FOR_*
of the work-sharing loop, then 2 loops are expecting generated loops
from interchange which would be inside of the body. But the innermost
l loop isn't part of the interchange, so the question is where to
put it. One possibility is to have it in the 4th loop of the OMP_FOR,
another possibility would be to add some artificial construct inside
of the OMP_INTERCHANGE and 2 OMP_REVERSE bodies which would contain
the inner loop(s), e.g. it could be OMP_INTERCHANGE without permutation
clause or some artificial ones or whatever.
I've recently raised various unclear things in the 5.1/5.2/TRs versions
regarding loop transformations, in particular
https://github.com/OpenMP/spec/issues/3908
https://github.com/OpenMP/spec/issues/3909
(sorry, private links unless you have OpenMP membership). Until those
are resolved, I have a sorry on trying to mix generated loops with
non-rectangular loops (way too many questions need to be answered before
that can be done) and similarly for mixing non-perfectly nested loops
with generated loops (again, it can be implemented somehow, but is way
too unclear). The second issue is mostly about data sharing, which is
ambiguous, the patch makes the artificial iterators of the loops effectively
private in the associated constructs (more like local), but for user
iterators doesn't do anything in particular, so for now one needs to use
explicit data sharing clauses on the non-loop transformation OpenMP looping
constructs or surrounding parallel/task/target etc.
2024-06-05 Jakub Jelinek <jakub@redhat.com>
Frederik Harwath <frederik@codesourcery.com>
Sandra Loosemore <sandra@codesourcery.com>
gcc/
* tree.def (OMP_TILE, OMP_UNROLL): New tree codes.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_PARTIAL,
OMP_CLAUSE_FULL and OMP_CLAUSE_SIZES.
* tree.h (OMP_LOOPXFORM_CHECK): Define.
(OMP_LOOPXFORM_LOWERED): Define.
(OMP_CLAUSE_PARTIAL_EXPR): Define.
(OMP_CLAUSE_SIZES_LIST): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add entries
for OMP_CLAUSE_{PARTIAL,FULL,SIZES}.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_{PARTIAL,FULL,SIZES}.
(dump_generic_node): Handle OMP_TILE and OMP_UNROLL. Skip printing
loops with NULL OMP_FOR_INIT (node) vector element.
* gimplify.cc (is_gimple_stmt): Handle OMP_TILE and OMP_UNROLL.
(gimplify_omp_taskloop_expr): For SAVE_EXPR use gimplify_save_expr.
(gimplify_omp_loop_xform): New function.
(gimplify_omp_for): Call omp_maybe_apply_loop_xforms and if that
reshuffles what the passed pointer points to, retry or return GS_OK.
Handle OMP_TILE and OMP_UNROLL.
(gimplify_omp_loop): Call omp_maybe_apply_loop_xforms and if that
reshuffles what the passed pointer points to, return GS_OK.
(gimplify_expr): Handle OMP_TILE and OMP_UNROLL.
* omp-general.h (omp_loop_number_of_iterations,
omp_maybe_apply_loop_xforms): Declare.
* omp-general.cc (omp_adjust_for_condition): For LE_EXPR and GE_EXPR
with pointers, don't add/subtract one, but the size of what the
pointer points to.
(omp_loop_number_of_iterations, omp_apply_tile,
find_nested_loop_xform, omp_maybe_apply_loop_xforms): New functions.
gcc/c-family/
* c-common.h (c_omp_find_generated_loop): Declare.
* c-gimplify.cc (c_genericize_control_stmt): Handle OMP_TILE and
OMP_UNROLL.
* c-omp.cc (c_finish_omp_for): Handle generated loops.
(c_omp_is_loop_iterator): Likewise.
(c_find_nested_loop_xform_r, c_omp_find_generated_loop): New
functions.
(c_omp_check_loop_iv): Handle generated loops. For now sorry
on mixing non-rectangular loop with generated loops.
(c_omp_check_loop_binding_exprs): For now sorry on mixing
imperfect loops with generated loops.
(c_omp_directives): Uncomment tile and unroll entries.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE and
PRAGMA_OMP_UNROLL, change PRAGMA_OMP__LAST_ to the latter.
(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FULL and
PRAGMA_OMP_CLAUSE_PARTIAL.
* c-pragma.cc (omp_pragmas_simd): Add tile and unroll omp pragmas.
gcc/c/
* c-parser.cc (c_parser_skip_std_attribute_spec_seq): New function.
(check_omp_intervening_code): Reject imperfectly nested tile.
(c_parser_compound_statement_nostart): If want_nested_loop, use
c_parser_omp_next_tokens_can_be_canon_loop instead of just checking
for RID_FOR keyword.
(c_parser_omp_clause_name): Handle full and partial clause names.
(c_parser_omp_clause_allocate): Remove spurious semicolon.
(c_parser_omp_clause_full, c_parser_omp_clause_partial): New
functions.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FULL and
PRAGMA_OMP_CLAUSE_PARTIAL.
(c_parser_omp_next_tokens_can_be_canon_loop): New function.
(c_parser_omp_loop_nest): Parse C23 attributes. Handle tile/unroll
constructs. Use c_parser_omp_next_tokens_can_be_canon_loop instead
of just checking for RID_FOR keyword. Only add_stmt (body) if it is
non-NULL.
(c_parser_omp_for_loop): Rename tiling variable to oacc_tiling. For
OMP_CLAUSE_SIZES set collapse to list length of OMP_CLAUSE_SIZES_LIST.
Use c_parser_omp_next_tokens_can_be_canon_loop instead of just
checking for RID_FOR keyword. Remove spurious semicolon. Don't call
c_omp_check_loop_binding_exprs if stmt is NULL. Skip generated loops.
(c_parser_omp_tile_sizes, c_parser_omp_tile): New functions.
(OMP_UNROLL_CLAUSE_MASK): Define.
(c_parser_omp_unroll): New function.
(c_parser_omp_construct): Handle PRAGMA_OMP_TILE and
PRAGMA_OMP_UNROLL.
* c-typeck.cc (c_finish_omp_clauses): Adjust wording of some of the
conflicting clause diagnostic messages to include word clause.
Handle OMP_CLAUSE_{FULL,PARTIAL,SIZES} and diagnose full vs. partial
conflict.
gcc/cp/
* cp-tree.h (dependent_omp_for_p): Add another tree argument.
* parser.cc (check_omp_intervening_code): Reject imperfectly nested
tile.
(cp_parser_statement_seq_opt): If want_nested_loop, use
cp_parser_next_tokens_can_be_canon_loop instead of just checking
for RID_FOR keyword.
(cp_parser_omp_clause_name): Handle full and partial clause names.
(cp_parser_omp_clause_full, cp_parser_omp_clause_partial): New
functions.
(cp_parser_omp_all_clauses): Formatting fix. Handle
PRAGMA_OMP_CLAUSE_PARTIAL and PRAGMA_OMP_CLAUSE_FULL.
(cp_parser_next_tokens_can_be_canon_loop): New function.
(cp_parser_omp_loop_nest): Parse C++11 attributes. Handle tile/unroll
constructs. Use cp_parser_next_tokens_can_be_canon_loop instead
of just checking for RID_FOR keyword. Only add_stmt
cp_parser_omp_loop_nest result if it is non-NULL.
(cp_parser_omp_for_loop): Rename tiling variable to oacc_tiling. For
OMP_CLAUSE_SIZES set collapse to list length of OMP_CLAUSE_SIZES_LIST.
Use cp_parser_next_tokens_can_be_canon_loop instead of just
checking for RID_FOR keyword. Remove spurious semicolon. Don't call
c_omp_check_loop_binding_exprs if stmt is NULL. Skip and/or handle
generated loops. Remove spurious ()s around & operands.
(cp_parser_omp_tile_sizes, cp_parser_omp_tile): New functions.
(OMP_UNROLL_CLAUSE_MASK): Define.
(cp_parser_omp_unroll): New function.
(cp_parser_omp_construct): Handle PRAGMA_OMP_TILE and
PRAGMA_OMP_UNROLL.
(cp_parser_pragma): Likewise.
* semantics.cc (finish_omp_clauses): Don't call
fold_build_cleanup_point_expr for cases which obviously won't need it,
like checked INTEGER_CSTs. Handle OMP_CLAUSE_{FULL,PARTIAL,SIZES}
and diagnose full vs. partial conflict. Adjust wording of some of the
conflicting clause diagnostic messages to include word clause.
(finish_omp_for): Use decl equal to global_namespace as a marker for
generated loop. Pass also body to dependent_omp_for_p. Skip
generated loops.
(finish_omp_for_block): Skip generated loops.
* pt.cc (tsubst_omp_clauses): Handle OMP_CLAUSE_{FULL,PARTIAL,SIZES}.
(tsubst_stmt): Handle OMP_TILE and OMP_UNROLL. Handle or skip
generated loops.
(dependent_omp_for_p): Add body argument. If declv vector element
is NULL, find generated loop.
* cp-gimplify.cc (cp_gimplify_expr): Handle OMP_TILE and OMP_UNROLL.
(cp_fold_r): Likewise.
(cp_genericize_r): Likewise. Skip generated loops.
gcc/fortran/
* gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL,
ST_OMP_END_UNROLL, ST_OMP_TILE and ST_OMP_END_TILE.
(struct gfc_omp_clauses): Add sizes_list, partial, full and erroneous
members.
(enum gfc_exec_op): Add EXEC_OMP_UNROLL and EXEC_OMP_TILE.
(gfc_expr_list_len): Declare.
* match.h (gfc_match_omp_tile, gfc_match_omp_unroll): Declare.
* openmp.cc (gfc_get_location): Declare.
(gfc_free_omp_clauses): Free sizes_list.
(match_oacc_expr_list): Rename to ...
(match_omp_oacc_expr_list): ... this. Add is_omp argument and
change diagnostic wording if it is true.
(enum omp_mask2): Add OMP_CLAUSE_{FULL,PARTIAL,SIZES}.
(gfc_match_omp_clauses): Parse full, partial and sizes clauses.
(gfc_match_oacc_wait): Use match_omp_oacc_expr_list instead of
match_oacc_expr_list.
(OMP_UNROLL_CLAUSES, OMP_TILE_CLAUSES): Define.
(gfc_match_omp_tile, gfc_match_omp_unroll): New functions.
(resolve_omp_clauses): Diagnose full vs. partial clause conflict.
Resolve sizes clause arguments.
(find_nested_loop_in_chain): Use switch instead of series of ifs.
Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL.
(gfc_resolve_omp_do_blocks): Set omp_current_do_collapse to
list length of sizes_list if present.
(gfc_resolve_do_iterator): Return for EXEC_OMP_TILE or
EXEC_OMP_UNROLL.
(restructure_intervening_code): Remove spurious ()s around & operands.
(is_outer_iteration_variable): Handle EXEC_OMP_TILE and
EXEC_OMP_UNROLL.
(check_nested_loop_in_chain): Likewise.
(expr_is_invariant): Likewise.
(resolve_omp_do): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL. Diagnose
tile without sizes clause. Use sizes_list length for count if
non-NULL. Set code->ext.omp_clauses->erroneous on loops where we've
reported diagnostics. Sorry for mixing non-rectangular loops with
generated loops.
(omp_code_to_statement): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Parse end tile, end unroll, tile
and unroll. Move nothing entry alphabetically.
(case_exec_markers): Add ST_OMP_TILE and ST_OMP_UNROLL.
(gfc_ascii_statement): Handle ST_OMP_END_TILE, ST_OMP_END_UNROLL,
ST_OMP_TILE and ST_OMP_UNROLL.
(parse_omp_do): Add nested argument. Handle ST_OMP_TILE and
ST_OMP_UNROLL.
(parse_omp_structured_block): Adjust parse_omp_do caller.
(parse_executable): Likewise. Handle ST_OMP_TILE and ST_OMP_UNROLL.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE and
EXEC_OMP_UNROLL.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans.cc (trans_code): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle full, partial and
sizes clauses. Use tree_cons + nreverse instead of
temporary vector and build_tree_list_vec for tile_list handling.
(gfc_expr_list_len): New function.
(gfc_trans_omp_do): Rename tile to oacc_tile. Handle sizes clause.
Don't assert code->op is EXEC_DO. Handle EXEC_OMP_TILE and
EXEC_OMP_UNROLL.
(gfc_trans_omp_directive): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL.
* dump-parse-tree.cc (show_omp_clauses): Dump full, partial and
sizes clauses.
(show_omp_node): Handle EXEC_OMP_TILE and EXEC_OMP_UNROLL.
(show_code_node): Likewise.
gcc/testsuite/
* c-c++-common/gomp/attrs-tile-1.c: New test.
* c-c++-common/gomp/attrs-tile-2.c: New test.
* c-c++-common/gomp/attrs-tile-3.c: New test.
* c-c++-common/gomp/attrs-tile-4.c: New test.
* c-c++-common/gomp/attrs-tile-5.c: New test.
* c-c++-common/gomp/attrs-tile-6.c: New test.
* c-c++-common/gomp/attrs-unroll-1.c: New test.
* c-c++-common/gomp/attrs-unroll-2.c: New test.
* c-c++-common/gomp/attrs-unroll-3.c: New test.
* c-c++-common/gomp/attrs-unroll-inner-1.c: New test.
* c-c++-common/gomp/attrs-unroll-inner-2.c: New test.
* c-c++-common/gomp/attrs-unroll-inner-3.c: New test.
* c-c++-common/gomp/attrs-unroll-inner-4.c: New test.
* c-c++-common/gomp/attrs-unroll-inner-5.c: New test.
* c-c++-common/gomp/imperfect-attributes.c: Adjust expected
diagnostics.
* c-c++-common/gomp/imperfect-loop-nest.c: New test.
* c-c++-common/gomp/ordered-5.c: New test.
* c-c++-common/gomp/scan-7.c: New test.
* c-c++-common/gomp/tile-1.c: New test.
* c-c++-common/gomp/tile-2.c: New test.
* c-c++-common/gomp/tile-3.c: New test.
* c-c++-common/gomp/tile-4.c: New test.
* c-c++-common/gomp/tile-5.c: New test.
* c-c++-common/gomp/tile-6.c: New test.
* c-c++-common/gomp/tile-7.c: New test.
* c-c++-common/gomp/tile-8.c: New test.
* c-c++-common/gomp/tile-9.c: New test.
* c-c++-common/gomp/tile-10.c: New test.
* c-c++-common/gomp/tile-11.c: New test.
* c-c++-common/gomp/tile-12.c: New test.
* c-c++-common/gomp/tile-13.c: New test.
* c-c++-common/gomp/tile-14.c: New test.
* c-c++-common/gomp/tile-15.c: New test.
* c-c++-common/gomp/unroll-1.c: New test.
* c-c++-common/gomp/unroll-2.c: New test.
* c-c++-common/gomp/unroll-3.c: New test.
* c-c++-common/gomp/unroll-4.c: New test.
* c-c++-common/gomp/unroll-5.c: New test.
* c-c++-common/gomp/unroll-6.c: New test.
* c-c++-common/gomp/unroll-7.c: New test.
* c-c++-common/gomp/unroll-8.c: New test.
* c-c++-common/gomp/unroll-9.c: New test.
* c-c++-common/gomp/unroll-inner-1.c: New test.
* c-c++-common/gomp/unroll-inner-2.c: New test.
* c-c++-common/gomp/unroll-inner-3.c: New test.
* c-c++-common/gomp/unroll-non-rect-1.c: New test.
* c-c++-common/gomp/unroll-non-rect-2.c: New test.
* c-c++-common/gomp/unroll-non-rect-3.c: New test.
* c-c++-common/gomp/unroll-simd-1.c: New test.
* gcc.dg/gomp/attrs-4.c: Adjust expected diagnostics.
* gcc.dg/gomp/for-1.c: Likewise.
* gcc.dg/gomp/for-11.c: Likewise.
* g++.dg/gomp/attrs-4.C: Likewise.
* g++.dg/gomp/for-1.C: Likewise.
* g++.dg/gomp/pr94512.C: Likewise.
* g++.dg/gomp/tile-1.C: New test.
* g++.dg/gomp/tile-2.C: New test.
* g++.dg/gomp/unroll-1.C: New test.
* g++.dg/gomp/unroll-2.C: New test.
* g++.dg/gomp/unroll-3.C: New test.
* gfortran.dg/gomp/inner-loops-1.f90: New test.
* gfortran.dg/gomp/inner-loops-2.f90: New test.
* gfortran.dg/gomp/pure-1.f90: Add tests for !$omp unroll
and !$omp tile.
* gfortran.dg/gomp/pure-2.f90: Remove those tests from here.
* gfortran.dg/gomp/scan-9.f90: New test.
* gfortran.dg/gomp/tile-1.f90: New test.
* gfortran.dg/gomp/tile-2.f90: New test.
* gfortran.dg/gomp/tile-3.f90: New test.
* gfortran.dg/gomp/tile-4.f90: New test.
* gfortran.dg/gomp/tile-5.f90: New test.
* gfortran.dg/gomp/tile-6.f90: New test.
* gfortran.dg/gomp/tile-7.f90: New test.
* gfortran.dg/gomp/tile-8.f90: New test.
* gfortran.dg/gomp/tile-9.f90: New test.
* gfortran.dg/gomp/tile-10.f90: New test.
* gfortran.dg/gomp/tile-imperfect-nest-1.f90: New test.
* gfortran.dg/gomp/tile-imperfect-nest-2.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-1.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-2.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-3.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-4.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-5.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-6.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-7.f90: New test.
* gfortran.dg/gomp/tile-inner-loops-8.f90: New test.
* gfortran.dg/gomp/tile-non-rectangular-1.f90: New test.
* gfortran.dg/gomp/tile-non-rectangular-2.f90: New test.
* gfortran.dg/gomp/tile-non-rectangular-3.f90: New test.
* gfortran.dg/gomp/tile-unroll-1.f90: New test.
* gfortran.dg/gomp/tile-unroll-2.f90: New test.
* gfortran.dg/gomp/unroll-1.f90: New test.
* gfortran.dg/gomp/unroll-2.f90: New test.
* gfortran.dg/gomp/unroll-3.f90: New test.
* gfortran.dg/gomp/unroll-4.f90: New test.
* gfortran.dg/gomp/unroll-5.f90: New test.
* gfortran.dg/gomp/unroll-6.f90: New test.
* gfortran.dg/gomp/unroll-7.f90: New test.
* gfortran.dg/gomp/unroll-8.f90: New test.
* gfortran.dg/gomp/unroll-9.f90: New test.
* gfortran.dg/gomp/unroll-10.f90: New test.
* gfortran.dg/gomp/unroll-11.f90: New test.
* gfortran.dg/gomp/unroll-12.f90: New test.
* gfortran.dg/gomp/unroll-13.f90: New test.
* gfortran.dg/gomp/unroll-inner-loop-1.f90: New test.
* gfortran.dg/gomp/unroll-inner-loop-2.f90: New test.
* gfortran.dg/gomp/unroll-no-clause-1.f90: New test.
* gfortran.dg/gomp/unroll-non-rect-1.f90: New test.
* gfortran.dg/gomp/unroll-non-rect-2.f90: New test.
* gfortran.dg/gomp/unroll-simd-1.f90: New test.
* gfortran.dg/gomp/unroll-simd-2.f90: New test.
* gfortran.dg/gomp/unroll-simd-3.f90: New test.
* gfortran.dg/gomp/unroll-tile-1.f90: New test.
* gfortran.dg/gomp/unroll-tile-2.f90: New test.
* gfortran.dg/gomp/unroll-tile-inner-1.f90: New test.
libgomp/
* testsuite/libgomp.c-c++-common/imperfect-transform-1.c: New test.
* testsuite/libgomp.c-c++-common/imperfect-transform-2.c: New test.
* testsuite/libgomp.c-c++-common/matrix-1.h: New test.
* testsuite/libgomp.c-c++-common/matrix-constant-iter.h: New test.
* testsuite/libgomp.c-c++-common/matrix-helper.h: New test.
* testsuite/libgomp.c-c++-common/matrix-no-directive-1.c: New test.
* testsuite/libgomp.c-c++-common/matrix-no-directive-unroll-full-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-for-1.c: New test.
* testsuite/libgomp.c-c++-common/matrix-omp-parallel-for-1.c: New test.
* testsuite/libgomp.c-c++-common/matrix-omp-parallel-masked-taskloop-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-parallel-masked-taskloop-simd-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-target-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-target-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-omp-taskloop-1.c: New test.
* testsuite/libgomp.c-c++-common/matrix-omp-teams-distribute-parallel-for-1.c:
New test.
* testsuite/libgomp.c-c++-common/matrix-simd-1.c: New test.
* testsuite/libgomp.c-c++-common/matrix-transform-variants-1.h:
New test.
* testsuite/libgomp.c-c++-common/target-imperfect-transform-1.c:
New test.
* testsuite/libgomp.c-c++-common/target-imperfect-transform-2.c:
New test.
* testsuite/libgomp.c-c++-common/unroll-1.c: New test.
* testsuite/libgomp.c-c++-common/unroll-non-rect-1.c: New test.
* testsuite/libgomp.c++/matrix-no-directive-unroll-full-1.C: New test.
* testsuite/libgomp.c++/tile-2.C: New test.
* testsuite/libgomp.c++/tile-3.C: New test.
* testsuite/libgomp.c++/unroll-1.C: New test.
* testsuite/libgomp.c++/unroll-2.C: New test.
* testsuite/libgomp.c++/unroll-full-tile.C: New test.
* testsuite/libgomp.fortran/imperfect-transform-1.f90: New test.
* testsuite/libgomp.fortran/imperfect-transform-2.f90: New test.
* testsuite/libgomp.fortran/inner-1.f90: New test.
* testsuite/libgomp.fortran/nested-fn.f90: New test.
* testsuite/libgomp.fortran/target-imperfect-transform-1.f90: New test.
* testsuite/libgomp.fortran/target-imperfect-transform-2.f90: New test.
* testsuite/libgomp.fortran/tile-1.f90: New test.
* testsuite/libgomp.fortran/tile-2.f90: New test.
* testsuite/libgomp.fortran/tile-unroll-1.f90: New test.
* testsuite/libgomp.fortran/tile-unroll-2.f90: New test.
* testsuite/libgomp.fortran/tile-unroll-3.f90: New test.
* testsuite/libgomp.fortran/tile-unroll-4.f90: New test.
* testsuite/libgomp.fortran/unroll-1.f90: New test.
* testsuite/libgomp.fortran/unroll-2.f90: New test.
* testsuite/libgomp.fortran/unroll-3.f90: New test.
* testsuite/libgomp.fortran/unroll-4.f90: New test.
* testsuite/libgomp.fortran/unroll-5.f90: New test.
* testsuite/libgomp.fortran/unroll-6.f90: New test.
* testsuite/libgomp.fortran/unroll-7a.f90: New test.
* testsuite/libgomp.fortran/unroll-7b.f90: New test.
* testsuite/libgomp.fortran/unroll-7c.f90: New test.
* testsuite/libgomp.fortran/unroll-7.f90: New test.
* testsuite/libgomp.fortran/unroll-8.f90: New test.
* testsuite/libgomp.fortran/unroll-simd-1.f90: New test.
* testsuite/libgomp.fortran/unroll-tile-1.f90: New test.
* testsuite/libgomp.fortran/unroll-tile-2.f90: New test.
|
|
|
|
A recent patch
commit bdc264a16e327c63d133131a695a202fbbc0a6a0
Author: Alexandre Oliva <oliva@adacore.com>
Date: Thu May 30 02:06:48 2024 -0300
[testsuite] conditionalize dg-additional-sources on target and type
added two additional args to dg-additional-files-options.
Unfortunately, this completely broke several testsuites like
ERROR: tcl error sourcing /vol/gcc/src/hg/master/local/libatomic/testsuite/../../gcc/testsuite/lib/gcc-dg.exp.
wrong # args: should be "dg-additional-files-options options source dest type"
since the patch forgot to adjust some of the callers.
This patch fixes that.
Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.
2024-05-31 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libatomic:
PR testsuite/115294
* testsuite/lib/libatomic.exp (libatomic_target_compile): Pass new
dg-additional-files-options args.
libgomp:
PR testsuite/115294
* testsuite/lib/libgomp.exp (libgomp_target_compile): Pass new
dg-additional-files-options args.
libitm:
PR testsuite/115294
* testsuite/lib/libitm.exp (libitm_target_compile): Pass new
dg-additional-files-options args.
libphobos:
PR testsuite/115294
* testsuite/lib/libphobos.exp (libphobos_target_compile): Pass new
dg-additional-files-options args.
libvtv:
PR testsuite/115294
* testsuite/lib/libvtv.exp (libvtv_target_compile): Pass new
dg-additional-files-options args.
|
|
|
|
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.0 status): Mark 'requires' as done and
link to 'Offload-Target Specifics'.
(OpenMP 5.2 status): Add item about additional map-type modifiers
in 'declare mapper'.
|
|
|
|
If HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true,
all GPUs on the system support unified shared memory. That's
the case for APUs and MI200 devices when XNACK is enabled.
XNACK can be enabled by setting HSA_XNACK=1 as env var for
supported devices; otherwise, if disable, USM code will
use host fallback.
gcc/ChangeLog:
* config/gcn/gcn-hsa.h (gcn_local_sym_hash): Fix typo.
include/ChangeLog:
* hsa.h (HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT): Add
enum value.
libgomp/ChangeLog:
* libgomp.texi (gcn): Update USM handling
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Handle
USM if HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true.
|
|
A few high-end nvptx devices support the attribute
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared
memory is supported in hardware. This patch enables support for those -
if all installed nvptx devices have this feature (as the capabilities
are per device type).
This exposes a bug in gomp_copy_back_icvs as it did before use
omp_get_mapped_ptr to find mapped variables, but that returns
the unchanged pointer in cased of shared memory. But in this case,
we have a few actually mapped pointers - like the ICV variables.
Additionally, there was a mismatch with regards to '-1' for the
device number as gomp_copy_back_icvs and omp_get_mapped_ptr count
differently. Hence, do the lookup manually.
include/ChangeLog:
* cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add.
libgomp/ChangeLog:
* libgomp.texi (nvptx): Update USM description.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices):
Claim support when requesting USM and all devices support
CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS.
* target.c (gomp_copy_back_icvs): Fix device ptr lookup.
(gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM is the
devices supports USM.
|