Age | Commit message (Collapse) | Author | Files | Lines |
|
... after its deprecation in GCC 12.
* Makefile.def: Remove module 'liboffloadmic'.
* Makefile.in: Regenerate.
* configure.ac: Remove 'liboffloadmic' handling.
* configure: Regenerate.
contrib/
* gcc-changelog/git_commit.py (default_changelog_locations):
Remove 'liboffloadmic'.
* gcc_update (files_and_dependencies): Remove 'liboffloadmic'
files.
* update-copyright.py (GCCCmdLine): Remove 'liboffloadmic'
comment.
gcc/
* config.gcc [target *-intelmic-* | *-intelmicemul-*]: Remove.
* config/i386/i386-options.cc (ix86_omp_device_kind_arch_isa)
[ACCEL_COMPILER]: Remove.
* config/i386/intelmic-mkoffload.cc: Remove.
* config/i386/intelmic-offload.h: Likewise.
* config/i386/t-intelmic: Likewise.
* config/i386/t-omp-device: Likewise.
* configure.ac [target *-intelmic-* | *-intelmicemul-*]: Remove.
* configure: Regenerate.
* doc/install.texi (--enable-offload-targets=[...]): Update.
* doc/sourcebuild.texi: Remove 'liboffloadmic' documentation.
include/
* gomp-constants.h (GOMP_DEVICE_INTEL_MIC): Comment out.
(GOMP_VERSION_INTEL_MIC): Remove.
libgomp/
* libgomp-plugin.h (OFFLOAD_TARGET_TYPE_INTEL_MIC): Remove.
* libgomp.texi (OpenMP Context Selectors): Remove Intel MIC
documentation.
* plugin/configfrag.ac <enable_offload_targets>
[*-intelmic-* | *-intelmicemul-*]: Remove.
* configure: Regenerate.
* testsuite/lib/libgomp.exp (libgomp_init): Remove 'liboffloadmic'
handling.
(offload_target_to_openacc_device_type)
[$offload_target = *-intelmic*]: Remove.
(check_effective_target_offload_device_intel_mic)
(check_effective_target_offload_device_any_intel_mic): Remove.
* testsuite/libgomp.c-c++-common/on_device_arch.h
(device_arch_intel_mic, on_device_arch_intel_mic, any_device_arch)
(any_device_arch_intel_mic): Remove.
* testsuite/libgomp.c-c++-common/target-45.c: Remove
'offload_device_any_intel_mic' XFAIL.
* testsuite/libgomp.fortran/target10.f90: Likewise.
liboffloadmic/
* ChangeLog: Remove.
* Makefile.am: Likewise.
* Makefile.in: Likewise.
* aclocal.m4: Likewise.
* configure: Likewise.
* configure.ac: Likewise.
* configure.tgt: Likewise.
* doc/doxygen/config: Likewise.
* doc/doxygen/header.tex: Likewise.
* include/coi/common/COIEngine_common.h: Likewise.
* include/coi/common/COIEvent_common.h: Likewise.
* include/coi/common/COIMacros_common.h: Likewise.
* include/coi/common/COIPerf_common.h: Likewise.
* include/coi/common/COIResult_common.h: Likewise.
* include/coi/common/COISysInfo_common.h: Likewise.
* include/coi/common/COITypes_common.h: Likewise.
* include/coi/sink/COIBuffer_sink.h: Likewise.
* include/coi/sink/COIPipeline_sink.h: Likewise.
* include/coi/sink/COIProcess_sink.h: Likewise.
* include/coi/source/COIBuffer_source.h: Likewise.
* include/coi/source/COIEngine_source.h: Likewise.
* include/coi/source/COIEvent_source.h: Likewise.
* include/coi/source/COIPipeline_source.h: Likewise.
* include/coi/source/COIProcess_source.h: Likewise.
* liboffloadmic_host.spec.in: Likewise.
* liboffloadmic_target.spec.in: Likewise.
* plugin/Makefile.am: Likewise.
* plugin/Makefile.in: Likewise.
* plugin/aclocal.m4: Likewise.
* plugin/configure: Likewise.
* plugin/configure.ac: Likewise.
* plugin/libgomp-plugin-intelmic.cpp: Likewise.
* plugin/offload_target_main.cpp: Likewise.
* runtime/cean_util.cpp: Likewise.
* runtime/cean_util.h: Likewise.
* runtime/coi/coi_client.cpp: Likewise.
* runtime/coi/coi_client.h: Likewise.
* runtime/coi/coi_server.cpp: Likewise.
* runtime/coi/coi_server.h: Likewise.
* runtime/compiler_if_host.cpp: Likewise.
* runtime/compiler_if_host.h: Likewise.
* runtime/compiler_if_target.cpp: Likewise.
* runtime/compiler_if_target.h: Likewise.
* runtime/dv_util.cpp: Likewise.
* runtime/dv_util.h: Likewise.
* runtime/emulator/coi_common.h: Likewise.
* runtime/emulator/coi_device.cpp: Likewise.
* runtime/emulator/coi_device.h: Likewise.
* runtime/emulator/coi_host.cpp: Likewise.
* runtime/emulator/coi_host.h: Likewise.
* runtime/emulator/coi_version_asm.h: Likewise.
* runtime/emulator/coi_version_linker_script.map: Likewise.
* runtime/liboffload_error.c: Likewise.
* runtime/liboffload_error_codes.h: Likewise.
* runtime/liboffload_msg.c: Likewise.
* runtime/liboffload_msg.h: Likewise.
* runtime/mic_lib.f90: Likewise.
* runtime/offload.h: Likewise.
* runtime/offload_common.cpp: Likewise.
* runtime/offload_common.h: Likewise.
* runtime/offload_engine.cpp: Likewise.
* runtime/offload_engine.h: Likewise.
* runtime/offload_env.cpp: Likewise.
* runtime/offload_env.h: Likewise.
* runtime/offload_host.cpp: Likewise.
* runtime/offload_host.h: Likewise.
* runtime/offload_iterator.h: Likewise.
* runtime/offload_omp_host.cpp: Likewise.
* runtime/offload_omp_target.cpp: Likewise.
* runtime/offload_orsl.cpp: Likewise.
* runtime/offload_orsl.h: Likewise.
* runtime/offload_table.cpp: Likewise.
* runtime/offload_table.h: Likewise.
* runtime/offload_target.cpp: Likewise.
* runtime/offload_target.h: Likewise.
* runtime/offload_target_main.cpp: Likewise.
* runtime/offload_timer.h: Likewise.
* runtime/offload_timer_host.cpp: Likewise.
* runtime/offload_timer_target.cpp: Likewise.
* runtime/offload_trace.cpp: Likewise.
* runtime/offload_trace.h: Likewise.
* runtime/offload_util.cpp: Likewise.
* runtime/offload_util.h: Likewise.
* runtime/ofldbegin.cpp: Likewise.
* runtime/ofldend.cpp: Likewise.
* runtime/orsl-lite/include/orsl-lite.h: Likewise.
* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
* runtime/orsl-lite/version.txt: Likewise.
|
|
OpenMP 5.0 permits to use arrays with derived type components for the list
items to the 'from'/'to' clauses of the 'target update' directive.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_clauses): Permit derived types for
the 'to' and 'from' clauses of 'target update'.
* trans-openmp.cc (gfc_trans_omp_clauses): Fixes for
derived-type changes; fix size for scalars.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-11.f90: New test.
* testsuite/libgomp.fortran/target-13.f90: New test.
|
|
[PR106643, PR96668]
PR libgomp/106643
PR fortran/96668
libgomp/
* oacc-mem.c (goacc_enter_data_internal): Support
OpenACC 'declare create' with Fortran allocatable arrays, part II.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/pr106643-1.f90: New.
|
|
[PR106643]
PR libgomp/106643
libgomp/
* oacc-mem.c (goacc_enter_data_internal): Support
OpenACC 'declare create' with Fortran allocatable arrays, part I.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
New.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
New.
|
|
libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
New.
|
|
... which is 'libgomp.oacc-fortran/declare-allocatable-1.f90' adjusted
for missing support for OpenACC "Changes from Version 2.0 to 2.5":
"The 'declare create' directive with a Fortran 'allocatable' has new behavior".
Thus, after 'allocate'/before 'deallocate', call 'acc_create'/'acc_delete'
manually.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
New.
|
|
libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90: New.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
This patch prevents compiler-generated artificial variables from being
treated as privatization candidates for OpenACC.
The rationale is that e.g. "gang-private" variables actually must be
shared by each worker and vector spawned within a particular gang, but
that sharing is not necessary for any compiler-generated variable (at
least at present, but no such need is anticipated either). Variables on
the stack (and machine registers) are already private per-"thread"
(gang, worker and/or vector), and that's fine for artificial variables.
We're restricting this to blocks, as we still need to understand what it
means for a 'DECL_ARTIFICIAL' to appear in a 'private' clause.
Several tests need their scan output patterns adjusted to compensate.
2022-10-14 Julian Brown <julian@codesourcery.com>
PR middle-end/90115
gcc/
* omp-low.cc (oacc_privatization_candidate_p): Artificial vars are not
privatization candidates.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-1.f90: Adjust scan output.
* testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise.
* testsuite/libgomp.oacc-fortran/if-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/print-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
[PR107195, PR107344]
That is, adjust for optimization introduced with recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0", where GCC now
understands that after 'r *= 2;', 'r & 1' will never hold here, and thus
transforms/optimizes/"disturbs" the original code such that GCC/nvptx's later
"Neuter whole SESE regions" optimization no longer is applicable to it:
UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
Same for C++.
It's unclear to me if this is an actual "problem", which optimization is "more
important", so I've filed PR107344 "GCC/nvptx SESE region optimization" to
capture this question, and here restore what we intend to be testing (to my
understanding) in 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.
PR tree-optimization/107195
PR target/107344
libgomp/
* testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c: Restore SESE
regions checking.
|
|
Duplicate libgomp.c-c++-common/requires-4.c (as ...-4a.c) but
with using a heap-allocated instead of static memory for a variable.
This change and the added offload_device_gcn check prepare for
pseudo-USM, where the device hardware cannot access all host
memory but only managed and pinned memory; for those, requires-4.c
will fail and the new check permits to add
target { ! { offload_device_nvptx || offload_device_gcn } }
to requires-4.c; however, it has not been added yet as pseuo-USM
support is not yet on mainline. (Review is pending for the USM
patches.)
include/ChangeLog:
* gomp-constants.h (GOMP_DEVICE_HSA): Comment out unused define.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn):
New.
* testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn,
on_device_arch_gcn): New.
* testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from
requires-4.c but using heap-allocated memory.
|
|
After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]",
"big" private data now works for GCN offloading, too.
PR target/105421
libgomp/
* testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.
|
|
'libgomp.c/reverse-offload-sm30.c'
That is, '-mptx=_' is only valid in '-foffload-options=nvptx-none', too.
Fix test case added in recent
commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 "nvptx/mkoffload.cc:
Warn instead of error when reverse offload is not possible".
libgomp/
* testsuite/libgomp.c/reverse-offload-sm30.c: Fix nvptx-specific
'-foffload-options' syntax.
|
|
Fortranized testcases of commits r13-3257-ga58a965eb73
and r13-3258-g0ec4e93fb9f.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/task-7.f90: New test.
* testsuite/libgomp.fortran/task-8.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-1.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-2.f90: New test.
* testsuite/libgomp.fortran/task-in-explicit-3.f90: New test.
* testsuite/libgomp.fortran/task-reduction-17.f90: New test.
* testsuite/libgomp.fortran/task-reduction-18.f90: New test.
|
|
This is pretty straightforward, if gomp_thread ()->task is NULL,
it can't be explicit task, otherwise if
gomp_thread ()->task->kind == GOMP_TASK_IMPLICIT, it is an implicit
task, otherwise explicit task.
2022-10-12 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_in_explicit_task): Declare.
* omp_lib.h.in (omp_in_explicit_task): Likewise.
* omp_lib.f90.in (omp_in_explicit_task): New interface.
* libgomp.map (OMP_5.2): New symbol version, export
omp_in_explicit_task and omp_in_explicit_task_.
* task.c (omp_in_explicit_task): New function.
* fortran.c (omp_in_explicit_task): Add ialias_redirect.
(omp_in_explicit_task_): New function.
* libgomp.texi (OpenMP 5.2): Mark omp_in_explicit_task as implemented.
* testsuite/libgomp.c-c++-common/task-in-explicit-1.c: New test.
* testsuite/libgomp.c-c++-common/task-in-explicit-2.c: New test.
* testsuite/libgomp.c-c++-common/task-in-explicit-3.c: New test.
|
|
When not in explicit parallel/target/teams construct, we in some cases create
an artificial parallel with a single thread (either to handle target nowait
or for task reduction purposes). In those cases, it handled again artificially
created implicit task (created by gomp_new_icv for cases where we needed to write
to some ICVs), but as the testcases show, didn't take into account possibility
of this being done from explicit task(s). The code would destroy/free the previous
task and replace it with the new implicit task. If task is an explicit task
(when teams is NULL, all explicit tasks behave like if (0)), it is a pointer to
a local stack variable, so freeing it doesn't work, and additionally we shouldn't
lose the explicit tasks - the new implicit task should instead replace the
ancestor task which is the first implicit one.
2022-10-12 Jakub Jelinek <jakub@redhat.com>
* task.c (gomp_create_artificial_team): Fix up handling of invocations
from within explicit task.
* target.c (GOMP_target_ext): Likewise.
* testsuite/libgomp.c/task-7.c: New test.
* testsuite/libgomp.c/task-8.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-17.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-18.c: New test.
|
|
OpenMP 5.1 added has_device_addr and relaxed the restrictions for
use_device_ptr, including processing non-type(c_ptr) arguments as
if has_device_addr was used. (There is a semantic difference.)
For completeness, the likewise change was done for 'use_device_ptr',
where non-type(c_ptr) arguments now use use_device_addr.
Finally, a warning for 'device(omp_{initial,invalid}_device)' was
silenced on the way as affecting the new testcase.
PR fortran/105318
gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Update is_device_ptr restrictions
for OpenMP 5.1 and map to has_device_addr where applicable; map
use_device_ptr to use_device_addr where applicable.
Silence integer-range warning for device(omp_{initial,invalid}_device).
libgomp/ChangeLog:
* testsuite/libgomp.fortran/is_device_ptr-2.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/is_device_ptr-1.f90: Remove dg-error.
* gfortran.dg/gomp/is_device_ptr-2.f90: Likewise.
* gfortran.dg/gomp/is_device_ptr-3.f90: Update tree-scan-dump.
|
|
This patch changes c_tree_equal to work more like cp_tree_equal, be
more strict in what it accepts. The ICE on the first testcase was
due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which
ICEs if the two constants have different precision, but as the second
testcase shows, being too lenient in it can also lead to miscompilation
of valid OpenMP programs where we think certain expression is the same
even when it isn't and can be guaranteed at runtime to represent different
memory location. So, the patch looks through only NON_LVALUE_EXPRs
and for constants as well as casts requires that the types match before
actually comparing the constant values or recursing on the cast operands.
2022-09-24 Jakub Jelinek <jakub@redhat.com>
PR c/106981
gcc/c/
* c-typeck.cc (c_tree_equal): Only strip NON_LVALUE_EXPRs at the
start. For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and
t2 have different types.
gcc/testsuite/
* c-c++-common/gomp/pr106981.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/pr106981.c: New test.
|
|
This patch refactors struct sibling-list processing in gimplify.cc, and
adjusts some related mapping-clause processing in the Fortran FE and
omp-low.cc accordingly.
2022-09-13 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Don't create
GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER
mappings for POINTER_TYPE_P decls.
gcc/
* gimplify.cc (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
(GOMP_FIRSTPRIVATE_IMPLICIT): Renumber.
(insert_struct_comp_map): Refactor function into...
(build_omp_struct_comp_nodes): This new function. Remove list handling
and improve self-documentation.
(extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move
code to strip outer parts of address out of function, but strip no-op
conversions.
(omp_mapping_group): Add DELETED field for use during reindexing.
(omp_strip_components_and_deref, omp_strip_indirections): New functions.
(omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
(omp_gather_mapping_groups): Initialise DELETED field for new groups.
(omp_index_mapping_groups): Notice DELETED groups when (re)indexing.
(omp_siblist_insert_node_after, omp_siblist_move_node_after,
omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New
helper functions.
(omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT
node groups for sibling lists. Outlined from gimplify_scan_omp_clauses.
(omp_build_struct_sibling_lists): New function.
(gimplify_scan_omp_clauses): Remove struct_map_to_clause,
struct_seen_clause, struct_deref_set. Call
omp_build_struct_sibling_lists as pre-pass instead of handling sibling
lists in the function's main processing loop.
(gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS
handling, unused now.
* omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect
struct references, and references to pointers to structs also.
gcc/testsuite/
* g++.dg/goacc/member-array-acc.C: New test.
* g++.dg/gomp/member-array-omp.C: New test.
* g++.dg/gomp/target-3.C: Update expected output.
* g++.dg/gomp/target-lambda-1.C: Likewise.
* g++.dg/gomp/target-this-2.C: Likewise.
* c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here.
* c-c++-common/gomp/target-50.c: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
* testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
* testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move
test to here, make "run" test.
|
|
Reverse offload requests at least -misa=sm_35; with this patch, a warning
instead of an error is shown, still permitting reverse offload for all
other configured device types. This is achieved by not calling
GOMP_offload_register_ver (and stopping generating pointless 'static const char'
variables, once known.)
The tool_name as progname changes adds "nvptx " and "gcn " to the
"mkoffload: warning/error:" diagnostic.
gcc/ChangeLog:
* config/nvptx/mkoffload.cc (process): Replace a fatal_error by
a warning + not enabling offloading if -misa=sm_30 prevents
reverse offload.
(main): Use tool_name as progname for diagnostic.
* config/gcn/mkoffload.cc (main): Likewise.
libgomp/ChangeLog:
* libgomp.texi (Offload-Target Specifics: nvptx): Document
that reverse offload requires >= -march=sm_35.
* testsuite/libgomp.c-c++-common/requires-4.c: Build for nvptx
with -misa=sm_35.
* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-6.c: Likewise.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: Likewise.
* testsuite/libgomp.fortran/reverse-offload-1.f90: Likewise.
* testsuite/libgomp.c/reverse-offload-sm30.c: New test.
|
|
The thing is,
make check
or
make check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'"
in libgomp obj dir work fine, but
make -j32 -k check RUNTESTFLAGS="c.exp='icv-6.c' c++.exp='icv-6.c'"
fails.
The thing is that the testcase as written relies on OMP_NUM_THREADS not being
set in environment (as it takes priority over OMP_NUM_THREADS_ALL for the
host).
So, if either a user has OMP_NUM_THREADS=42 in the environment by himself, or
when doing make check with -jN, we trigger:
if test $$num_cpus -gt 8 && test -z "$$OMP_NUM_THREADS"; then \
OMP_NUM_THREADS=8; export OMP_NUM_THREADS; \
echo @@@ libgomp OMP_NUM_THREADS adjusted to 8 because of parallel
make check and too many CPUs; \
fi; \
in libgomp/testsuite/Makefile.am and so the test fails.
2022-09-12 Jakub Jelinek <jakub@redhat.com>
PR libgomp/106894
* testsuite/libgomp.c-c++-common/icv-6.c: Include string.h.
(main): Avoid tests for which corresponding non-_ALL suffixed variable
is in the environment, or for OMP_NUM_TEAMS on the device
OMP_NUM_TEAMS_DEV_?.
|
|
This patch considers the environment variable syntax extension for
device-specific variants of environment variables from OpenMP 5.1 (see
OpenMP 5.1 specification, p. 75 and p. 639). An environment variable (e.g.
OMP_NUM_TEAMS) can have different suffixes:
_DEV (e.g. OMP_NUM_TEAMS_DEV): affects all devices but not the host.
_DEV_<device> (e.g. OMP_NUM_TEAMS_DEV_42): affects only device with
number <device>.
no suffix (e.g. OMP_NUM_TEAMS): affects only the host.
In future OpenMP versions also suffix _ALL will be introduced (see discussion
https://github.com/OpenMP/spec/issues/3179). This is also considered in this
patch:
_ALL (e.g. OMP_NUM_TEAMS_ALL): affects all devices and the host.
The precedence is as follows (descending). For the host:
1. no suffix
2. _ALL
For devices:
1. _DEV_<device>
2. _DEV
3. _ALL
That means, _DEV_<device> is used whenever available. Otherwise _DEV is used if
available, and at last _ALL. If there is no value for any of the variable
variants, default values are used as already implemented before.
This patch concerns parsing (a), storing (b), output (c) and transmission to the
device (d):
(a) The actual number of devices and the numbering are not known when parsing
the environment variables. Thus all environment variables are iterated and
searched for device-specific ones.
(b) Only configured device-specific variables are stored. Thus, a linked list
is used.
(c) The output is done in omp_display_env (see specification p. 468f). Global
ICVs are tagged with [all], see https://github.com/OpenMP/spec/issues/3179.
ICVs which are not global but aren't handled device-specific yet are tagged
with [host]. omp_display_env outputs the initial values of the ICVs. That is
why a dedicated data structure is introduced for the inital values only
(gomp_initial_icv_list).
(d) Device-specific ICVs are transmitted to the device via GOMP_ADDITIONAL_ICVS.
libgomp/ChangeLog:
* config/gcn/icv-device.c (omp_get_default_device): Return device-
specific ICV.
(omp_get_max_teams): Added for GCN devices.
(omp_set_num_teams): Likewise.
(ialias): Likewise.
* config/nvptx/icv-device.c (omp_get_default_device): Return device-
specific ICV.
(omp_get_max_teams): Added for NVPTX devices.
(omp_set_num_teams): Likewise.
(ialias): Likewise.
* env.c (struct gomp_icv_list): New struct to store entries of initial
ICV values.
(struct gomp_offload_icv_list): New struct to store entries of device-
specific ICV values that are copied to the device and back.
(struct gomp_default_icv_values): New struct to store default values of
ICVs according to the OpenMP standard.
(parse_schedule): Generalized for different variants of OMP_SCHEDULE.
(print_env_var_error): Function that prints an error for invalid values
for ICVs.
(parse_unsigned_long_1): Removed getenv. Generalized.
(parse_unsigned_long): Likewise.
(parse_int_1): Likewise.
(parse_int): Likewise.
(parse_int_secure): Likewise.
(parse_unsigned_long_list): Likewise.
(parse_target_offload): Likewise.
(parse_bind_var): Likewise.
(parse_stacksize): Likewise.
(parse_boolean): Likewise.
(parse_wait_policy): Likewise.
(parse_allocator): Likewise.
(omp_display_env): Extended to output different variants of environment
variables.
(print_schedule): New helper function for omp_display_env which prints
the values of run_sched_var.
(print_proc_bind): New helper function for omp_display_env which prints
the values of proc_bind_var.
(enum gomp_parse_type): Collection of types used for parsing environment
variables.
(ENTRY): Preprocess string lengths of environment variables.
(OMP_VAR_CNT): Preprocess table size.
(OMP_HOST_VAR_CNT): Likewise.
(INT_MAX_STR_LEN): Constant for the maximal number of digits of a device
number.
(gomp_get_icv_flag): Returns if a flag for a particular ICV is set.
(gomp_set_icv_flag): Sets a flag for a particular ICV.
(print_device_specific_icvs): New helper function for omp_display_env to
print device specific ICV values.
(get_device_num): New helper function for parse_device_specific.
Extracts the device number from an environment variable name.
(get_icv_member_addr): Gets the memory address for a particular member
of an ICV struct.
(gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list.
(initialize_icvs): New function to initialize a gomp_initial_icvs
struct.
(add_initial_icv_to_list): Adds an ICV struct to gomp_initial_icv_list.
(startswith): Checks if a string starts with a given prefix.
(initialize_env): Extended to parse the new syntax of environment
variables.
* icv-device.c (omp_get_max_teams): Added.
(ialias): Likewise.
(omp_set_num_teams): Likewise.
* icv.c (omp_set_num_teams): Moved to icv-device.c.
(omp_get_max_teams): Likewise.
(ialias): Likewise.
* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Removed.
(GOMP_ADDITIONAL_ICVS): New target-side struct that
holds the designated ICVs of the target device.
* libgomp.h (enum gomp_icvs): Collection of ICVs.
(enum gomp_device_num): Definition of device numbers for _ALL, _DEV, and
no suffix.
(enum gomp_env_suffix): Collection of possible suffixes of environment
variables.
(struct gomp_initial_icvs): Contains all ICVs for which we need to store
initial values.
(struct gomp_default_icv):New struct to hold ICVs for which we need
to store initial values.
(struct gomp_icv_list): Definition of a linked list that is used for
storing ICVs for the devices and also for _DEV, _ALL, and without
suffix.
(struct gomp_offload_icvs): New struct to hold ICVs that are copied to
a device.
(struct gomp_offload_icv_list): Definition of a linked list that holds
device-specific ICVs that are copied to devices.
(gomp_get_initial_icv_item): Get a list item of gomp_initial_icv_list.
(gomp_get_icv_flag): Returns if a flag for a particular ICV is set.
* libgomp.texi: Updated.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Extended to read
further ICVs from the offload image.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
* target.c (gomp_get_offload_icv_item): Get a list item of
gomp_offload_icv_list.
(get_gomp_offload_icvs): New. Returns the ICV values
depending on the device num and the variable hierarchy.
(gomp_load_image_to_device): Extended to copy further ICVs to a device.
* testsuite/libgomp.c-c++-common/icv-5.c: New test.
* testsuite/libgomp.c-c++-common/icv-6.c: New test.
* testsuite/libgomp.c-c++-common/icv-7.c: New test.
* testsuite/libgomp.c-c++-common/icv-8.c: New test.
* testsuite/libgomp.c-c++-common/omp-display-env-1.c: New test.
* testsuite/libgomp.c-c++-common/omp-display-env-2.c: New test.
|
|
This patch implements doacross(sink: omp_cur_iteration - 1) that the
previous patchset emitted a sorry on during omp expansion.
It can be implemented with existing library functions.
To recap, depend(source)/doacross(source:)/doacross(source:omp_cur_iteration)
is implemented calling GOMP_doacross_post or GOMP_doacross_ull_post,
called with an array of long or unsigned long long elements, one for
all collapsed loops together and one for each further ordered loop if any.
We initialize that array in each thread when grabbing further set of iterations
and update it at the end of loops, so that it represents the current iteration
(as 0 based counters). When the worksharing loop is created, we tell the
library through another similar array the counts (the loop needs to be
rectangular) in each dimension, first element is count of all logical iterations
in the collapsed loops.
depend(sink:v1 op N1, v2 op N2, ...) is then implemented by conditionally calling
GOMP_doacross_wait/GOMP_doacross_ull_wait. For N? of 0 there is no check,
otherwise if it wants to wait in a particular dimension for a previous iteration,
we check that the corresponding iterator isn't the first one (or first few),
where the previous iterator in that dimension would be out of range, and similarly
for checking of next iteration in a dimension that it isn't the last one (or last few)
where it would be similarly out of bounds. Then the collapsed loop counters are
folded into a single 0 based counter (first argument) and then other 0 based
iterations counters on what iteration it should wait for.
Now, doacross(sink: omp_cur_iteration - 1) is supposed to wait for the previous
logical iteration in the combined iteration space of all ordered loops.
For the very first iteration in that combined iteration space it does nothing,
there is no previous iteration. And similarly it does nothing if there
are more ordered loops than collapsed loop and it isn't the first logical
iteration of the combined loops inside of the collapsed loops, because as implemented
we know the previous iteration in that case is always executed by the same thread
as the current one.
In the implementation, we use the same value as is stored in the first element
of the array for GOMP_doacross_post/GOMP_doacross_ull_post, if that value is 0,
we do nothing. The rest is different based on if ordered argument is equal to
collapse or not. If it is, then we otherwise call
GOMP_doacross_wait/GOMP_doacross_ull_wait with a single argument, one less than
that counter we compare against 0.
If ordered argument is bigger than collapse, we add a per-thread boolean variable
.first.N, which we set to true at the start of the outermost ordered loop inside
of the collapsed set of loops and set to false at the end of the innermost
ordered loop. If .first.N is false, we don't do anything (we know the previous
iteration was handled by the current thread and by my reading of the spec we don't
need to emit even a memory barrier in that case, because it is just synchronization
with the same thread), otherwise we call GOMP_doacross_wait/GOMP_doacross_ull_wait
with the first argument one less than the counter we compare against 0, and then
one less than 2nd and following counts if iterations we pass to the workshare
initialization. If say .counts.N passed to the workshare initialization is
{ 256, 13, 5, 2 } for collapse(3) ordered(6) loop, then
GOMP_doacross_post/GOMP_doacross_ull_post is called with arguments equal to
.ordereda.N[0] - 1, 12, 4, 1.
2022-09-08 Jakub Jelinek <jakub@redhat.com>
gcc/
* omp-expand.cc (expand_omp_ordered_sink): Add CONT_BB argument.
Add doacross(sink:omp_cur_iteration-1) support.
(expand_omp_ordered_source_sink): Clear counts[fd->ordered + 1].
Adjust expand_omp_ordered_sink caller.
(expand_omp_for_ordered_loops): If counts[fd->ordered + 1] is
non-NULL, set that variable to true at the start of outermost
non-collapsed loop and set it to false at the end of innermost
ordered loop.
(expand_omp_for_generic): If fd->ordered, allocate
1 + (fd->ordered - fd->collapse) further elements in counts array.
Copy to counts + 2 + fd->ordered the counts of fd->collapse ..
fd->ordered - 1 loop if any.
gcc/testsuite/
* c-c++-common/gomp/doacross-7.c: New test.
libgomp/
* libgomp.texi (OpenMP 5.2): Mention that omp_cur_iteration is now
fully supported.
* testsuite/libgomp.c/doacross-4.c: New test.
* testsuite/libgomp.c/doacross-5.c: New test.
* testsuite/libgomp.c/doacross-6.c: New test.
* testsuite/libgomp.c/doacross-7.c: New test.
|
|
gcc/ChangeLog:
* internal-fn.cc (expand_GOMP_TARGET_REV): New.
* internal-fn.def (GOMP_TARGET_REV): New.
* lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
'omp target device_ancestor_host' as in_other_partition and don't
error if absent.
* omp-low.cc (create_omp_child_function): Mark as 'noclone'.
* omp-expand.cc (expand_omp_target): For reverse offload, remove
sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
empty-body nohost function.
* omp-offload.cc (execute_omp_device_lower): Handle
IFN_GOMP_TARGET_REV.
(pass_omp_target_link::execute): For ACCEL_COMPILER, don't
nullify fn argument for reverse offload
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
refer to 'requires'.
* testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
* testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
* testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
* testsuite/libgomp.fortran/reverse-offload-1.f90: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
* c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to
scan-tree-dump-times.
* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
Likewise.
* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-serial.c: Likewise.
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
* c-c++-common/goacc/kernels-loop-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-3.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
* c-c++-common/goacc/kernels-loop-data.c: Likewise.
* c-c++-common/goacc/kernels-loop-g.c: Likewise.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
* c-c++-common/goacc/kernels-loop-n.c: Likewise.
* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
* c-c++-common/goacc/kernels-loop.c: Likewise.
* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise.
* gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-serial.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
* gfortran.dg/goacc/kernels-loop.f95: Likewise.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise.
|
|
gcc/ChangeLog:
PR middle-end/106548
* omp-low.cc (lower_rec_input_clauses): Use build_outer_var_ref
for 'simd' linear-step values that are variable.
libgomp/ChangeLog:
PR middle-end/106548
* testsuite/libgomp.c/linear-2.c: New test.
|
|
This run-time test test pointer-based iteration with collapse,
similar to the '(parallel) simd' test for PR106449 but for 'for'.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/pr106449-2.c: New test.
|
|
iterators [PR106449]
There were 2 issues visible on this new testcase, one that we didn't have
special POINTER_TYPE_P handling in a few spots of expand_omp_simd - for
pointers we need to use POINTER_PLUS_EXPR and need to have the non-pointer
part in sizetype, for non-rectangular loop on the other side we can rely on
multiplication factor 1, pointers can't be multiplied, without those changes
we'd ICE. The other issue was that we put n2 expression directly into a
comparison in a condition and regimplified that, for the &a[512] case that
and with gimplification being destructed that unfortunately meant modification
of original fd->loops[?].n2. Fixed by unsharing the expression. This was
causing a runtime failure on the testcase.
2022-07-29 Jakub Jelinek <jakub@redhat.com>
PR middle-end/106449
* omp-expand.cc (expand_omp_simd): Fix up handling of pointer
iterators in non-rectangular simd loops. Unshare fd->loops[i].n2
or n2 before regimplifying it inside of a condition.
* testsuite/libgomp.c-c++-common/pr106449.c: New test.
|
|
'libgomp.oacc-c-c++-common/reduction-5.c' [PR101551]
Fix-up for recent commit 06b2a2abe26554c6f9365676683d67368cbba206
"Enhance '_Pragma' diagnostics verification in OMP C/C++ test cases".
Supposedly it's the same issue as in
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101551#c2>, where I'd
noted that:
| [...] with an offloading-enabled build of GCC we're losing
| "note: in expansion of macro '[...]'" diagnostics.
| (Effectively '-ftrack-macro-expansion=0'?)
PR middle-end/101551
libgomp/
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: XFAIL
'offloading_enabled' diagnostics issue.
|
|
Follow-up to recent commit 0587cef3d7962a8b0f44779589ba2920dd3d71e5
"c: Fix location for _Pragma tokens [PR97498]".
gcc/testsuite/
* c-c++-common/gomp/pragma-3.c: Enhance '_Pragma' diagnostics
verification.
* c-c++-common/gomp/pragma-5.c: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Enhance
'_Pragma' diagnostics verification.
|
|
The handling of #pragma GCC diagnostic uses input_location, which is not always
as precise as needed; in particular the relative location of some tokens and a
_Pragma directive will crucially determine whether a given diagnostic is enabled
or suppressed in the desired way. PR97498 shows how the C frontend ends up with
input_location pointing to the beginning of the line containing a _Pragma()
directive, resulting in the wrong behavior if the diagnostic to be modified
pertains to some tokens found earlier on the same line. This patch fixes that by
addressing two issues:
a) libcpp was not assigning a valid location to the CPP_PRAGMA token
generated by the _Pragma directive.
b) C frontend was not setting input_location to something reasonable.
With this change, the C frontend is able to change input_location to point to
the _Pragma token as needed.
This is just a two-line fix (one for each of a) and b)), the testsuite changes
were needed only because the location on the tested warnings has been somewhat
improved, so the tests need to look for the new locations.
gcc/c/ChangeLog:
PR preprocessor/97498
* c-parser.cc (c_parser_pragma): Set input_location to the
location of the pragma, rather than the start of the line.
libcpp/ChangeLog:
PR preprocessor/97498
* directives.cc (destringize_and_run): Override the location of
the CPP_PRAGMA token from a _Pragma directive to the location of
the expansion point, as is done for the tokens lexed from it.
gcc/testsuite/ChangeLog:
PR preprocessor/97498
* c-c++-common/pr97498.c: New test.
* c-c++-common/gomp/pragma-3.c: Adapt for improved warning locations.
* c-c++-common/gomp/pragma-5.c: Likewise.
* gcc.dg/pragma-message.c: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Adapt for
improved warning locations.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Likewise.
|
|
Fix-up for recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
gcc/
* lto-cgraph.cc (input_offload_tables) <LTO_symtab_edge>: Correct
'fn2' computation.
libgomp/
* testsuite/libgomp.c-c++-common/requires-1.c: Add 'dg-note's.
* testsuite/libgomp.c-c++-common/requires-2.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-3.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-7.c: Likewise.
* testsuite/libgomp.fortran/requires-1.f90: Likewise.
|
|
Similar to how the other 'mkoffload's got changed in
recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
This also means finally switching Intel MIC 'mkoffload' to
'GOMP_offload_register_ver', 'GOMP_offload_unregister_ver',
making 'GOMP_offload_register', 'GOMP_offload_unregister'
legacy entry points.
gcc/
* config/i386/intelmic-mkoffload.cc (generate_host_descr_file)
(prepare_target_image, main): Handle OpenMP 'requires'.
(generate_host_descr_file): Switch to 'GOMP_offload_register_ver',
'GOMP_offload_unregister_ver'.
libgomp/
* target.c (GOMP_offload_register, GOMP_offload_unregister):
Denote as legacy entry points.
* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_any): New proc.
* testsuite/libgomp.c-c++-common/requires-1.c: Enable for
'offload_target_any'.
* testsuite/libgomp.c-c++-common/requires-3.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-7.c: Likewise.
* testsuite/libgomp.fortran/requires-1.f90: Likewise.
|
|
'libgomp.c-c++-common/requires-5.c' testing
These should compile and link and execute in all configurations; host-fallback
execution, which we may actually verify.
Follow-up to recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
libgomp/
* testsuite/libgomp.c-c++-common/requires-4.c: Enhance testing.
* testsuite/libgomp.c-c++-common/requires-5.c: Likewise.
|
|
As documented, this one does "Check diagnostic by device-compiler's lto1".
Indeed there are none when compiling with '-foffload=disable' with an
offloading-enabled compiler, so we should use 'offload_target_[...]', as
used in other similar test cases.
Follow-up to recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
libgomp/
* testsuite/libgomp.c-c++-common/requires-3.c: Adjust.
|
|
Handle reverse_offload, unified_address, and unified_shared_memory
requirements in libgomp by saving them alongside the offload table.
When the device lto1 runs, it extracts the data for mkoffload. The
latter than passes the value on to GOMP_offload_register_ver.
lto1 (either the host one, with -flto [+ ENABLE_OFFLOADING], or in the
offload-device lto1) also does the the consistency check is done,
erroring out when the 'omp requires' clause use is inconsistent.
For all in-principle supported devices, if a requirement cannot be fulfilled,
the device is excluded from the (supported) devices list. Currently, none of
those requirements are marked as supported for any of the non-host devices.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_target_data, c_parser_omp_target_update,
c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Set
OMP_REQUIRES_TARGET_USED.
(c_parser_omp_requires): Remove sorry.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (process_asm): Write '#include <stdint.h>'.
(process_obj): Pass omp_requires_mask to GOMP_offload_register_ver.
(main): Ask lto1 to obtain omp_requires_mask and pass it on.
* config/nvptx/mkoffload.cc (process, main): Likewise.
* lto-cgraph.cc (omp_requires_to_name): New.
(input_offload_tables): Save omp_requires_mask.
(output_offload_tables): Read it, check for consistency,
save value for mkoffload.
* omp-low.cc (lower_omp_target): Force output_offloadtables
call for OMP_REQUIRES_TARGET_USED.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_target_data,
cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data,
cp_parser_omp_target_update): Set OMP_REQUIRES_TARGET_USED.
(cp_parser_omp_requires): Remove sorry.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_requires): Remove sorry.
* parse.cc (decode_omp_directive): Don't regard 'declare target'
as target usage for 'omp requires'; add more flags to
omp_requires_mask.
include/ChangeLog:
* gomp-constants.h (GOMP_VERSION): Bump to 2.
(GOMP_REQUIRES_UNIFIED_ADDRESS, GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
GOMP_REQUIRES_REVERSE_OFFLOAD, GOMP_REQUIRES_TARGET_USED):
New defines.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_get_num_devices): Add
omp_requires_mask arg.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Likewise;
return -1 when device available but omp_requires_mask != 0.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise.
* oacc-host.c (host_get_num_devices, host_openacc_get_property):
Update call.
* oacc-init.c (resolve_device, acc_init_1, acc_shutdown_1,
goacc_attach_host_thread_to_device, acc_get_num_devices,
acc_set_device_num, get_property_any): Likewise.
* target.c (omp_requires_mask): New global var.
(gomp_requires_to_name): New.
(GOMP_offload_register_ver): Handle passed omp_requires_mask.
(gomp_target_init): Handle omp_requires_mask.
* libgomp.texi (OpenMP 5.0): Update requires impl. status.
(OpenMP 5.1): Add a missed item.
(OpenMP 5.2): Mark linear-clause change as supported in C/C++.
* testsuite/libgomp.c-c++-common/requires-1-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-1.c: New test.
* testsuite/libgomp.c-c++-common/requires-2-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-2.c: New test.
* testsuite/libgomp.c-c++-common/requires-3-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-3.c: New test.
* testsuite/libgomp.c-c++-common/requires-4-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-4.c: New test.
* testsuite/libgomp.c-c++-common/requires-5-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-5.c: New test.
* testsuite/libgomp.c-c++-common/requires-6.c: New test.
* testsuite/libgomp.c-c++-common/requires-7-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-7.c: New test.
* testsuite/libgomp.fortran/requires-1-aux.f90: New test.
* testsuite/libgomp.fortran/requires-1.f90: New test.
liboffloadmic/ChangeLog:
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_num_devices):
Return -1 when device available but omp_requires_mask != 0.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/requires-4.c: Update dg-*.
* c-c++-common/gomp/reverse-offload-1.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-2.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-3.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-5.c: Likewise.
* gfortran.dg/gomp/target-device-ancestor-3.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-2.f90: Likewise. Move
post-FE checks to ...
* gfortran.dg/gomp/target-device-ancestor-2a.f90: ... this new file.
* gfortran.dg/gomp/requires-8.f90: Update as we don't regard
'declare target' for the 'requires' usage requirement.
Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
The i variable is used inside of the parallel in:
#pragma omp simd safelen(32) private (v)
for (i = 0; i < 64; i++)
{
v = 3 * i;
ll[i] = u1 + v * u2[0] + u2[1] + x + y[0] + y[1] + v + h[0] + u3[i];
}
where i is predetermined linear (so while inside of the body
it is safe, private per SIMD lane var) the final value is written to
the shared variable, and in:
for (i = 0; i < 64; i++)
if (ll[i] != u1 + 3 * i * u2[0] + u2[1] + x + y[0] + y[1] + 3 * i + 13 + 14 + i)
#pragma omp atomic write
err = 1;
which is a normal loop and so it isn't in any way privatized there.
So we have a data race, fixed by adding private (i) clause to the
parallel.
2022-06-21 Jakub Jelinek <jakub@redhat.com>
Paul Iannetta <piannetta@kalrayinc.com>
PR libgomp/106045
* testsuite/libgomp.c/target-31.c: Add private (i) clause.
|
|
On Tue, Jun 14, 2022 at 06:41:37PM +0200, Thomas Schwinge wrote:
> In an offloading configuration, I'm seeing:
>
> PASS: libgomp.fortran/get-mapped-ptr-1.f90 -O (test for excess errors)
> [-PASS:-]{+FAIL:+} libgomp.fortran/get-mapped-ptr-1.f90 -O execution test
>
> Does that one need similar treatment?
I assume not just that but libgomp.c-c++-common/get-mapped-ptr-1.c too?
It both needs the same treatment, and in the get-mapped-ptr-1.c
case there is even UB, while the Fortran version was using c_loc (q)
as the host pointer, in C/C++ it was using q which was value of
uninitialized pointer.
2022-06-15 Jakub Jelinek <jakub@redhat.com>
* testsuite/libgomp.c-c++-common/get-mapped-ptr-1.c (main): Initialize
q to ddress of an automatic variable. Use -5 instead of -1 in
omp_get_mapped_ptr call. Add test with omp_initial_device.
* testsuite/libgomp.fortran/get-mapped-ptr-1.f90 (main): Use -5 instead
of -1 in omp_get_mapped_ptr call. Add test with omp_initial_device.
Renumber stop arguments afterwards.
|
|
OpenMP 5.2 changed once more what device numbers are allowed.
In 5.1, valid device numbers were [0, omp_get_num_devices()].
5.2 makes also -1 valid (calls it omp_initial_device), which is equivalent
in behavior to omp_get_num_devices() number but has the advantage that it
is a constant. And it also introduces omp_invalid_device which is
also a constant with implementation defined value < -1. That value should
act like sNaN, any time any device construct (GOMP_target*) or OpenMP runtime
API routine is asked for such a device, the program is terminated.
And if OMP_TARGET_OFFLOAD=mandatory, all non-conforming device numbers (which
is all but [-1, omp_get_num_devices()] other than omp_invalid_device)
must be treated like omp_invalid_device.
For device constructs, we have a compatibility problem, we've historically
used 2 magic negative values to mean something special.
GOMP_DEVICE_ICV (-1) means device clause wasn't present, pick the
omp_get_default_device () number
GOMP_DEVICE_FALLBACK (-2) means the host device (this is used e.g. for
#pragma omp target if (cond)
where if cond is false, we pass -2
But 5.2 requires that omp_initial_device is -1 (there were discussions
about it, advantage of -1 is that one can say iterate over the
[-1, omp_get_num_devices()-1] range to get all devices starting with
the host/initial one.
And also, if user passes -2, unless it is omp_invalid_device, we need to
treat it like non-conforming with OMP_TARGET_OFFLOAD=mandatory.
So, the patch does on the compiler side some number remapping,
user_device_num >= -2U ? user_device_num - 1 : user_device_num.
This remapping is done at compile time if device clause has constant
argument, otherwise at runtime, and means that for user -1 (omp_initial_device)
we pass -2 to GOMP_* in the runtime library where it treats it like host
fallback, while -2 is remapped to -3 (one of the non-conforming device numbers,
for those it doesn't matter which one is which).
omp_invalid_device is then -4.
For the OpenMP device runtime APIs, no remapping is done.
This patch doesn't deal with the initial default-device-var for
OMP_TARGET_OFFLOAD=mandatory , the spec says that the inital ICV value
for that should in that case depend on whether there are any offloading
devices or not (if not, should be omp_invalid_device), but that means
we can't determine the number of devices lazily (and let libraries have the
possibility to register their offloading data etc.).
2022-06-13 Jakub Jelinek <jakub@redhat.com>
gcc/
* omp-expand.cc (expand_omp_target): Remap user provided
device clause arguments, -1 to -2 and -2 to -3, either
at compile time if constant, or at runtime.
include/
* gomp-constants.h (GOMP_DEVICE_INVALID): Define.
libgomp/
* omp.h.in (omp_initial_device, omp_invalid_device): New enumerators.
* omp_lib.f90.in (omp_initial_device, omp_invalid_device): New
parameters.
* omp_lib.h.in (omp_initial_device, omp_invalid_device): Likewise.
* target.c (resolve_device): Add remapped argument, handle
GOMP_DEVICE_ICV only if remapped is true (and clear remapped),
for negative values, treat GOMP_DEVICE_FALLBACK as fallback only
if remapped, otherwise treat omp_initial_device that way. For
omp_invalid_device, always emit gomp_fatal, even when
OMP_TARGET_OFFLOAD isn't mandatory.
(GOMP_target, GOMP_target_ext, GOMP_target_data, GOMP_target_data_ext,
GOMP_target_update, GOMP_target_update_ext,
GOMP_target_enter_exit_data): Pass true as remapped argument to
resolve_device.
(omp_target_alloc, omp_target_free, omp_target_is_present,
omp_target_memcpy_check, omp_target_associate_ptr,
omp_target_disassociate_ptr, omp_get_mapped_ptr,
omp_target_is_accessible): Pass false as remapped argument to
resolve_device. Treat omp_initial_device the same as
gomp_get_num_devices (). Don't bypass resolve_device calls if
device_num is negative.
(omp_pause_resource): Treat omp_initial_device the same as
gomp_get_num_devices (). Call resolve_device.
* icv-device.c (omp_set_default_device): Always set to device_num
even when it is negative.
* libgomp.texi: Document that Conforming device numbers,
omp_initial_device and omp_invalid_device is implemented.
* testsuite/libgomp.c/target-41.c (main): Add test with
omp_initial_device.
* testsuite/libgomp.c/target-45.c: New test.
* testsuite/libgomp.c/target-46.c: New test.
* testsuite/libgomp.c/target-47.c: New test.
* testsuite/libgomp.c-c++-common/target-is-accessible-1.c (main): Add
test with omp_initial_device. Use -5 instead of -1 for negative value
test.
* testsuite/libgomp.fortran/target-is-accessible-1.f90 (main):
Likewise. Reorder stop numbers.
|
|
option
That means, exposing to the user only the '--without-cuda-driver' behavior:
including the GCC-shipped 'include/cuda/cuda.h' (not system <cuda.h>), and
'dlopen'ing the CUDA Driver library (not linking it).
For development purposes, the libgomp nvptx plugin developer may still manually
override that, to get the previous '--with-cuda-driver' behavior.
libgomp/
* plugin/Makefrag.am: Evaluate 'if PLUGIN_NVPTX_DYNAMIC' to true.
* plugin/configfrag.ac (--with-cuda-driver)
(--with-cuda-driver-include, --with-cuda-driver-lib)
(CUDA_DRIVER_INCLUDE, CUDA_DRIVER_LIB, PLUGIN_NVPTX_CPPFLAGS)
(PLUGIN_NVPTX_LDFLAGS, PLUGIN_NVPTX_LIBS, PLUGIN_NVPTX_DYNAMIC):
Remove.
* testsuite/libgomp-test-support.exp.in (cuda_driver_include)
(cuda_driver_lib): Remove.
* testsuite/lib/libgomp.exp (libgomp_init): Don't consider these.
* Makefile.in: Regenerate.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.
|
|
construct
Fortran commit to C/C++/backend commit
r13-862-gf38b20d68fade5a922b9f68c4c3841e653d1b83c
gcc/fortran/ChangeLog:
* openmp.cc (OMP_SCOPE_CLAUSES): Add firstprivate and allocate.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark scope w/ firstprivate/allocate as Y.
* testsuite/libgomp.fortran/scope-2.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/scope-5.f90: New test.
* gfortran.dg/gomp/scope-6.f90: New test.
|
|
This patch adds support to gcc's diagnostic subsystem for emitting
diagnostics in SARIF, aka the Static Analysis Results Interchange Format:
https://sarifweb.azurewebsites.net/
by extending -fdiagnostics-format= to add two new options:
-fdiagnostics-format=sarif-stderr
and:
-fdiagnostics-format=sarif-file
The patch targets SARIF v2.1.0
This is a JSON-based format suited for capturing the results of static
analysis tools (like GCC's -fanalyzer), but it can also be used for plain
GCC warnings and errors.
SARIF supports per-event metadata in diagnostic paths such as
["acquire", "resource"] and ["release", "lock"] (specifically, the
threadFlowLocation "kinds" property: SARIF v2.1.0 section 3.38.8), so
the patch extends GCC"s diagnostic_event subclass with a "struct meaning"
with similar purpose. The patch implements this for -fanalyzer so that
the various state-machine-based warnings set these in the SARIF output.
The heart of the implementation is in the new file
diagnostic-format-sarif.cc. Much of the rest of the patch is interface
classes, isolating the diagnostic subsystem (which has no knowledge of
e.g. tree or langhook) from the "client" code in the compiler proper
cc1 etc).
The patch adds a langhook for specifying the SARIF v2.1.0
"artifact.sourceLanguage" property, based on the list in
SARIF v2.1.0 Appendix J.
The patch adds automated DejaGnu tests to our testsuite via new
scan-sarif-file and scan-sarif-file-not directives (although these
merely use regexps, rather than attempting to use a proper JSON parser).
I've tested the patch by hand using the validator at:
https://sarifweb.azurewebsites.net/Validation
and the react-based viewer at:
https://microsoft.github.io/sarif-web-component/
which successfully shows most of the information (although not paths,
and not CWE IDs), and I've fixed all validation errors I've seen (though
bugs no doubt remain).
I've also tested the generated SARIF using the VS Code extension linked
to from the SARIF website; I'm a novice with VS Code, but it seems to be
able to handle my generated SARIF files (e.g. showing the data in the
SARIF tab, and showing squiggly underlines under issues, and when I
click on them, it visualizes the events in the path inline within the
source window).
Has anyone written an Emacs mode for SARIF files? (pretty please)
gcc/ChangeLog:
* Makefile.in (OBJS): Add tree-diagnostic-client-data-hooks.o and
tree-logical-location.o.
(OBJS-libcommon): Add diagnostic-format-sarif.o; reorder.
(CFLAGS-tree-diagnostic-client-data-hooks.o): Add TARGET_NAME.
* common.opt (fdiagnostics-format=): Add sarif-stderr and sarif-file.
(sarif-stderr, sarif-file): New enum values.
* diagnostic-client-data-hooks.h: New file.
* diagnostic-format-sarif.cc: New file.
* diagnostic-path.h (enum diagnostic_event::verb): New enum.
(enum diagnostic_event::noun): New enum.
(enum diagnostic_event::property): New enum.
(struct diagnostic_event::meaning): New struct.
(diagnostic_event::get_logical_location): New vfunc.
(diagnostic_event::get_meaning): New vfunc.
(simple_diagnostic_event::get_logical_location): New vfunc impl.
(simple_diagnostic_event::get_meaning): New vfunc impl.
* diagnostic.cc: Include "diagnostic-client-data-hooks.h".
(diagnostic_initialize): Initialize m_client_data_hooks.
(diagnostic_finish): Clean up m_client_data_hooks.
(diagnostic_event::meaning::dump_to_pp): New.
(diagnostic_event::meaning::maybe_get_verb_str): New.
(diagnostic_event::meaning::maybe_get_noun_str): New.
(diagnostic_event::meaning::maybe_get_property_str): New.
(get_cwe_url): Make non-static.
(diagnostic_output_format_init): Handle
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
* diagnostic.h (enum diagnostics_output_format): Add
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
(class diagnostic_client_data_hooks): New forward decl.
(class logical_location): New forward decl.
(diagnostic_context::m_client_data_hooks): New field.
(diagnostic_output_format_init_sarif_stderr): New decl.
(diagnostic_output_format_init_sarif_file): New decl.
(get_cwe_url): New decl.
* doc/invoke.texi (-fdiagnostics-format=): Add sarif-stderr and
sarif-file.
* doc/sourcebuild.texi (Scan a particular file): Add
scan-sarif-file and scan-sarif-file-not.
* langhooks-def.h (lhd_get_sarif_source_language): New decl.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): New macro.
(LANG_HOOKS_INITIALIZER): Add
LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE.
* langhooks.cc (lhd_get_sarif_source_language): New.
* langhooks.h (lang_hooks::get_sarif_source_language): New field.
* logical-location.h: New file.
* plugin.cc (struct for_each_plugin_closure): New.
(for_each_plugin_cb): New.
(for_each_plugin): New.
* plugin.h (for_each_plugin): New decl.
* tree-diagnostic-client-data-hooks.cc: New file.
* tree-diagnostic.cc: Include "diagnostic-client-data-hooks.h".
(tree_diagnostics_defaults): Populate m_client_data_hooks.
* tree-logical-location.cc: New file.
* tree-logical-location.h: New file.
gcc/ada/ChangeLog:
* gcc-interface/misc.cc (gnat_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/analyzer/ChangeLog:
* checker-path.cc (checker_event::get_meaning): New.
(function_entry_event::get_meaning): New.
(state_change_event::get_desc): Add dump of meaning of the event
to the -fanalyzer-verbose-state-changes output.
(state_change_event::get_meaning): New.
(cfg_edge_event::get_meaning): New.
(call_event::get_meaning): New.
(return_event::get_meaning): New.
(start_consolidated_cfg_edges_event::get_meaning): New.
(warning_event::get_meaning): New.
* checker-path.h: Include "tree-logical-location.h".
(checker_event::checker_event): Construct m_logical_loc.
(checker_event::get_logical_location): New.
(checker_event::get_meaning): New decl.
(checker_event::m_logical_loc): New.
(function_entry_event::get_meaning): New decl.
(state_change_event::get_meaning): New decl.
(cfg_edge_event::get_meaning): New decl.
(call_event::get_meaning): New decl.
(return_event::get_meaning): New decl.
(start_consolidated_cfg_edges_event::get_meaning): New.
(warning_event::get_meaning): New decl.
* pending-diagnostic.h: Include "diagnostic-path.h".
(pending_diagnostic::get_meaning_for_state_change): New vfunc.
* sm-file.cc (file_diagnostic::get_meaning_for_state_change): New
vfunc impl.
* sm-malloc.cc (malloc_diagnostic::get_meaning_for_state_change):
Likewise.
* sm-sensitive.cc
(exposure_through_output_file::get_meaning_for_state_change):
Likewise.
* sm-taint.cc (taint_diagnostic::get_meaning_for_state_change):
Likewise.
* varargs.cc
(va_list_sm_diagnostic::get_meaning_for_state_change): Likewise.
gcc/c/ChangeLog:
* c-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(c_get_sarif_source_language): New.
* c-tree.h (c_get_sarif_source_language): New decl.
gcc/cp/ChangeLog:
* cp-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(cp_get_sarif_source_language): New.
gcc/d/ChangeLog:
* d-lang.cc (d_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/fortran/ChangeLog:
* f95-lang.cc (gfc_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/go/ChangeLog:
* go-lang.cc (go_get_sarif_source_language): New.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
gcc/objc/ChangeLog:
* objc-act.h (objc_get_sarif_source_language): New decl.
* objc-lang.cc (LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): Redefine.
(objc_get_sarif_source_language): New.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-sarif-file-1.c: New test.
* c-c++-common/diagnostic-format-sarif-file-2.c: New test.
* c-c++-common/diagnostic-format-sarif-file-3.c: New test.
* c-c++-common/diagnostic-format-sarif-file-4.c: New test.
* gcc.dg/analyzer/file-meaning-1.c: New test.
* gcc.dg/analyzer/malloc-meaning-1.c: New test.
* gcc.dg/analyzer/malloc-sarif-1.c: New test.
* gcc.dg/plugin/analyzer_gil_plugin.c
(gil_diagnostic::get_meaning_for_state_change): New vfunc impl.
* gcc.dg/plugin/diagnostic-test-paths-5.c: New test.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add
diagnostic-test-paths-5.c to tests for
diagnostic_plugin_test_paths.c.
* lib/gcc-dg.exp: Load scansarif.exp.
* lib/scansarif.exp: New test.
libatomic/ChangeLog:
* testsuite/lib/libatomic.exp: Add load_gcc_lib of scansarif.exp.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp: Add load_gcc_lib of scansarif.exp.
libitm/ChangeLog:
* testsuite/lib/libitm.exp: Add load_gcc_lib of scansarif.exp.
libphobos/ChangeLog:
* testsuite/lib/libphobos-dg.exp: Add load_gcc_lib of scansarif.exp.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
OpenMP 5.2 adds support for firstprivate and allocate clauses on the scope
construct and this patch adds that support to GCC.
5.2 unfortunately (IMNSHO mistakenly) marked scope construct as worksharing,
which implies that it isn't possible to nest inside of it other scope,
worksharing loop, sections, explicit barriers, single etc. which would
make scope far less useful. I'm not implementing that part, keeping the
5.1 behavior here, and will file an issue to revert that for OpenMP 6.0.
But, for firstprivate it keeps the restriction that is now implied from
worksharing construct that listed var can't be private in outer context,
where for reduction 5.1 had similar restriction explicit even for scope
and 5.2 has it implicitly through worksharing construct.
2022-05-31 Jakub Jelinek <jakub@redhat.com>
gcc/
* omp-low.cc (build_outer_var_ref): For code == OMP_CLAUSE_ALLOCATE
allow var to be private in the outer context.
(lower_private_allocate): Pass OMP_CLAUSE_ALLOCATE as last argument
to build_outer_var_ref.
gcc/c/
* c-parser.cc (OMP_SCOPE_CLAUSE_MASK): Add firstprivate and allocate
clauses.
gcc/cp/
* parser.cc (OMP_SCOPE_CLAUSE_MASK): Add firstprivate and allocate
clauses.
gcc/testsuite/
* c-c++-common/gomp/scope-5.c: New test.
* c-c++-common/gomp/scope-6.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Add firstprivate and allocate clauses
to scope construct.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/allocate-1.c (foo): Add testcase for
scope construct with allocate clause.
* testsuite/libgomp.c-c++-common/allocate-3.c (foo): Likewise.
* testsuite/libgomp.c-c++-common/scope-2.c: New test.
|
|
Fortran version to C/C++ commit r13-797-g0ccba4ed8571c18c7015413441e971
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_clauses): Handle OMP_LIST_ENTER.
* gfortran.h: Add OMP_LIST_ENTER.
* openmp.cc (enum omp_mask2, OMP_DECLARE_TARGET_CLAUSES): Add
OMP_CLAUSE_ENTER.
(gfc_match_omp_clauses, gfc_match_omp_declare_target,
resolve_omp_clauses): Handle 'enter' clause.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark 'enter' clause as supported.
* testsuite/libgomp.fortran/declare-target-1.f90: Extend to test
explicit 'to' and 'enter' clause.
* testsuite/libgomp.fortran/declare-target-2.f90: Update accordingly.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/declare-target-2.f90: Add 'enter' clause test.
* gfortran.dg/gomp/declare-target-4.f90: Likewise.
|
|
OpenMP 5.1 and earlier had 2 different uses of to clause, one for target
update construct with one semantics, and one for declare target directive
with a different semantics.
Under the hood we were using OMP_CLAUSE_TO_DECLARE to represent the latter.
OpenMP 5.2 renamed the declare target clause to to enter, the old one is
kept as a deprecated alias.
As we are far from having full OpenMP 5.2 support, this patch adds support
for the enter clause (and renames OMP_CLAUSE_TO_DECLARE to OMP_CLAUSE_ENTER
with a flag to tell the spelling of the clause for better diagnostics),
but doesn't deprecate the to clause on declare target just yet (that
should be done as one of the last steps in 5.2 support).
2022-05-27 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TO_DECLARE
to OMP_CLAUSE_ENTER.
* tree.h (OMP_CLAUSE_ENTER_TO): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Rename
OMP_CLAUSE_TO_DECLARE to OMP_CLAUSE_ENTER.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_ENTER
instead of OMP_CLAUSE_TO_DECLARE, if OMP_CLAUSE_ENTER_TO, print
"to" instead of "enter".
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_ENTER.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Parse enter clause.
(c_parser_omp_all_clauses): For to clause on declare target, use
OMP_CLAUSE_ENTER clause with OMP_CLAUSE_ENTER_TO instead of
OMP_CLAUSE_TO_DECLARE clause. Handle PRAGMA_OMP_CLAUSE_ENTER.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add enter clause.
(c_parser_omp_declare_target): Use OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_ENTER instead
of OMP_CLAUSE_TO_DECLARE, to OMP_CLAUSE_ENTER_TO use "to" as clause
name in diagnostics instead of
omp_clause_code_name[OMP_CLAUSE_CODE (c)].
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Parse enter clause.
(cp_parser_omp_all_clauses): For to clause on declare target, use
OMP_CLAUSE_ENTER clause with OMP_CLAUSE_ENTER_TO instead of
OMP_CLAUSE_TO_DECLARE clause. Handle PRAGMA_OMP_CLAUSE_ENTER.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add enter clause.
(cp_parser_omp_declare_target): Use OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
* semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_ENTER instead
of OMP_CLAUSE_TO_DECLARE, to OMP_CLAUSE_ENTER_TO use "to" as clause
name in diagnostics instead of
omp_clause_code_name[OMP_CLAUSE_CODE (c)].
gcc/testsuite/
* c-c++-common/gomp/clauses-3.c: Add tests with enter clause instead
of to or modify some existing to clauses to enter.
* c-c++-common/gomp/declare-target-1.c: Likewise.
* c-c++-common/gomp/declare-target-2.c: Likewise.
* c-c++-common/gomp/declare-target-3.c: Likewise.
* g++.dg/gomp/attrs-9.C: Likewise.
* g++.dg/gomp/declare-target-1.C: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/target-40.c: Modify some existing to
clauses to enter.
* testsuite/libgomp.c/target-41.c: Likewise.
|
|
Richi reported occassional hangs with taskwait-depend-nowait-1.*
tests and I've finally manged to reproduce. The problem is if
taskwait depend without nowait is encountered soon after
taskwait depend nowait and the former depends on the latter and there
is no other work to do, the taskwait depend without nowait is put
to sleep, but the empty_task optimization in
gomp_task_run_post_handle_dependers wouldn't wake it up in that
case. gomp_task_run_post_handle_dependers normally does some wakeups
because it schedules more work (another task), which is not the
case of empty_task, but we need to do the wakeups that would be done
upon task completion so that we awake sleeping threads when the
last child is done.
So, the taskwait-depend-nowait-1.* testcase is fixed with the
else if (__builtin_expect (task->parent_depends_on, 0) part of
the patch.
The new testcase can hang on another problem, if the empty task
is the last task of a taskgroup, we need to use atomic store
like elsewhere to decrease the counter to 0, and wake up taskgroup
end if needed.
Yet another spot which can sleep is normal taskwait (without depend),
but I believe nothing needs to be done for that - in that case we
await solely until the children's queue has no tasks, tasks still
waiting for dependencies aren't accounted in that, but the reason
is that if taskwait should wait for something, there needs to be at least
one active child doing something (in the children queue), which then
possibly awakes some of its siblings when the dependencies are met,
or in the empty task case awakes further dependencies, but in any
case the child that finished is still handled as active child and
will awake taskwait at the end if there is nothing further to
do.
Last sleeping case are barriers, but that is handled by ++ret and
awaking the barrier.
2022-05-25 Jakub Jelinek <jakub@redhat.com>
* task.c (gomp_task_run_post_handle_dependers): If empty_task
is the last task taskwait depend depends on, wake it up.
Similarly if it is the last child of a taskgroup, use atomic
store instead of decrement and awak taskgroup wait if any.
* testsuite/libgomp.c-c++-common/taskwait-depend-nowait-2.c: New test.
|
|
Fortran part to C/C++/libgomp
commit r13-724-gb43836914bdc2a37563cf31359b2c4803bfe4374
gcc/fortran/
PR c/105378
* openmp.cc (gfc_match_omp_taskwait): Accept nowait.
gcc/testsuite/
PR c/105378
* gfortran.dg/gomp/taskwait-depend-nowait-1.f90: New.
libgomp/
PR c/105378
* libgomp.texi (OpenMP 5.1): Set 'taskwait nowait' to 'Y'.
* testsuite/libgomp.fortran/taskwait-depend-nowait-1.f90: New.
|
|
This patch adds support for (so far C/C++)
#pragma omp taskwait nowait depend(...)
directive, which is like
#pragma omp task depend(...)
;
but slightly optimized on the library side, so that it creates
the task only for the purpose of dependency tracking and doesn't actually
schedule it and wait for it when the dependencies are satisfied, instead
makes its dependencies satisfied right away.
2022-05-24 Jakub Jelinek <jakub@redhat.com>
PR c/105378
gcc/
* omp-builtins.def (BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT): New
builtin.
* gimplify.cc (gimplify_omp_task): Diagnose taskwait with nowait
clause but no depend clauses.
* omp-expand.cc (expand_taskwait_call): Use
BUILT_IN_GOMP_TASKWAIT_DEPEND_NOWAIT rather than
BUILT_IN_GOMP_TASKWAIT_DEPEND if nowait clause is present.
gcc/c/
* c-parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/cp/
* parser.cc (OMP_TASKWAIT_CLAUSE_MASK): Add nowait clause.
gcc/testsuite/
* c-c++-common/gomp/taskwait-depend-nowait-1.c: New test.
libgomp/
* libgomp_g.h (GOMP_taskwait_depend_nowait): Declare.
* libgomp.map (GOMP_taskwait_depend_nowait): Export at GOMP_5.1.1.
* task.c (empty_task): New function.
(gomp_task_run_post_handle_depend_hash): Declare earlier.
(gomp_task_run_post_handle_depend): Declare.
(GOMP_task): Optimize fn == empty_task if there is nothing to wait
for.
(gomp_task_run_post_handle_dependers): Optimize task->fn == empty_task.
(GOMP_taskwait_depend_nowait): New function.
* testsuite/libgomp.c-c++-common/taskwait-depend-nowait-1.c: New test.
|
|
For allocatable/pointer arrays, a firstprivate to a device
not only needs to privatize the descriptor but also the actual
data. This is implemented as:
firstprivate(x) firstprivate(x.data) attach(x [bias: &x.data-&x)
where the address of x in device memory is saved in hostaddrs[i]
by libgomp and the middle end actually passes hostaddrs[i]' to
attach.
As side effect, has_device_addr(array_desc) had to be changed:
before, it was converted to firstprivate in the front end; now
it is handled in omp-low.cc as has_device_addr requires a shallow
firstprivate (not touching the data pointer) while the normal
firstprivate requires (now) a deep firstprivate.
gcc/fortran/ChangeLog:
PR fortran/104949
* f95-lang.cc (LANG_HOOKS_OMP_ARRAY_SIZE): Redefine.
* trans-openmp.cc (gfc_omp_array_size): New.
(gfc_trans_omp_variable_list): Never turn has_device_addr
to firstprivate.
* trans.h (gfc_omp_array_size): New.
gcc/ChangeLog:
PR fortran/104949
* langhooks-def.h (lhd_omp_array_size): New.
(LANG_HOOKS_OMP_ARRAY_SIZE): Define.
(LANG_HOOKS_DECLS): Add it.
* langhooks.cc (lhd_omp_array_size): New.
* langhooks.h (struct lang_hooks_for_decls): Add hook.
* omp-low.cc (scan_sharing_clauses, lower_omp_target):
Handle GOMP_MAP_FIRSTPRIVATE for array descriptors.
libgomp/ChangeLog:
PR fortran/104949
* target.c (gomp_map_vars_internal, copy_firstprivate_data):
Support attach for GOMP_MAP_FIRSTPRIVATE.
* testsuite/libgomp.fortran/target-firstprivate-1.f90: New test.
* testsuite/libgomp.fortran/target-firstprivate-2.f90: New test.
* testsuite/libgomp.fortran/target-firstprivate-3.f90: New test.
|
|
omp_target_memcpy_rect_async
This patch adds two new OpenMP runtime routines: omp_target_memcpy_async and
omp_target_memcpy_rect_async. Both functions are introduced in OpenMP 5.1 as
asynchronous variants of omp_target_memcpy and omp_target_memcpy_rect.
In contrast to the synchronous variants, the asynchronous functions have two
additional function parameters to allow the specification of task dependences:
int depobj_count
omp_depend_t *depobj_list
integer(c_int), value :: depobj_count
integer(omp_depend_kind), optional :: depobj_list(*)
The implementation splits the synchronous functions into two parts: (a) check
and (b) copy. Then (a) is used in the asynchronous functions for the sequential
part, and the actual copy process (b) is executed in a new created task. The
sequential part (a) takes into account the requirements for the return values:
"The routine returns zero if successful. Otherwise, it returns a non-zero
value." (omp_target_memcpy_async, OpenMP 5.1 spec, section 3.8.7)
"An application can determine the number of inclusive dimensions supported by an
implementation by passing NULL pointers (or C_NULL_PTR, for Fortran) for both
dst and src. The routine returns the number of dimensions supported by the
implementation for the specified device numbers. No copy operation is
performed." (omp_target_memcpy_rect_async, OpenMP 5.1 spec, section 3.8.8)
Due to asynchronicity an error is thrown if the asynchronous memcpy is not
successful (in contrast to the synchronous functions which use a return
value unequal to zero).
gcc/ChangeLog:
* omp-low.cc (omp_runtime_api_call): Added target_memcpy_async and
target_memcpy_rect_async to omp_runtime_apis array.
libgomp/ChangeLog:
* libgomp.map: Added omp_target_memcpy_async and
omp_target_memcpy_rect_async.
* libgomp.texi: Both functions are now supported.
* omp.h.in: Added omp_target_memcpy_async and
omp_target_memcpy_rect_async.
* omp_lib.f90.in: Added interfaces for both new functions.
* omp_lib.h.in: Likewise.
* target.c (ialias_redirect): Added for GOMP_task.
(omp_target_memcpy): Restructured into check and copy part.
(omp_target_memcpy_check): New helper function for omp_target_memcpy and
omp_target_memcpy_async that checks requirements.
(omp_target_memcpy_copy): New helper function for omp_target_memcpy and
omp_target_memcpy_async that performs the memcpy.
(omp_target_memcpy_async_helper): New helper function that is used in
omp_target_memcpy_async for the asynchronous task.
(omp_target_memcpy_async): Added.
(omp_target_memcpy_rect): Restructured into check and copy part.
(omp_target_memcpy_rect_check): New helper function for
omp_target_memcpy_rect and omp_target_memcpy_rect_async that checks
requirements.
(omp_target_memcpy_rect_copy): New helper function for
omp_target_memcpy_rect and omp_target_memcpy_rect_async that performs
the memcpy.
(omp_target_memcpy_rect_async_helper): New helper function that is used
in omp_target_memcpy_rect_async for the asynchronous task.
(omp_target_memcpy_rect_async): Added.
* task.c (ialias): Added for GOMP_task.
* testsuite/libgomp.c-c++-common/target-memcpy-async-1.c: New test.
* testsuite/libgomp.c-c++-common/target-memcpy-async-2.c: New test.
* testsuite/libgomp.c-c++-common/target-memcpy-rect-async-1.c: New test.
* testsuite/libgomp.c-c++-common/target-memcpy-rect-async-2.c: New test.
* testsuite/libgomp.fortran/target-memcpy-async-1.f90: New test.
* testsuite/libgomp.fortran/target-memcpy-async-2.f90: New test.
* testsuite/libgomp.fortran/target-memcpy-rect-async-1.f90: New test.
* testsuite/libgomp.fortran/target-memcpy-rect-async-2.f90: New test.
|
|
Fortran additions to the C/C++ + ME/libgomp commit
r13-556-g2c16eb3157f86ae561468c540caf8eb326106b5f
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_omp_depend_op): Add OMP_DEPEND_INOUTSET.
(gfc_omp_clauses): Enlarge ENUM_BITFIELD.
* dump-parse-tree.cc (show_omp_namelist, show_omp_clauses): Handle
'inoutset' depend modifier.
* openmp.cc (gfc_match_omp_clauses, gfc_match_omp_depobj): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses, gfc_trans_omp_depobj):
Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Set 'inoutset' to Y.
(OpenMP Context Selectors): Add missing comma.
* testsuite/libgomp.fortran/depend-5.f90: Add inoutset test.
* testsuite/libgomp.fortran/depend-6.f90: Likewise.
* testsuite/libgomp.fortran/depend-7.f90: Likewise.
* testsuite/libgomp.fortran/depend-inoutset-1.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/all-memory-1.f90: Add inoutset test.
* gfortran.dg/gomp/all-memory-2.f90: Likewise.
* gfortran.dg/gomp/depobj-1.f90: Likewise.
* gfortran.dg/gomp/depobj-2.f90: Likewise.
|
|
This patch adds support for inoutset depend-kind in depend
clauses. It is very similar to the in depend-kind in that
a task with a dependency with that depend-kind is dependent
on all previously created sibling tasks with matching address
unless they have the same depend-kind.
In the in depend-kind case everything is dependent except
for in -> in dependency, for inoutset everything is
dependent except for inoutset -> inoutset dependency.
mutexinoutset is also similar (everything is dependent except
for mutexinoutset -> mutexinoutset dependency), but there is
also the additional restriction that only one task with
mutexinoutset for each address can be scheduled at once (i.e.
mutual exclusitivty). For now we support mutexinoutset
the same as inout/out, but the inoutset support is full.
In order not to bump the ABI for dependencies each time
(we've bumped it already once, the old ABI supports only
inout/out and in depend-kind, the new ABI supports
inout/out, mutexinoutset, in and depobj), this patch arranges
for inoutset to be at least for the time being always handled
as if it was specified through depobj even when it is not.
So it uses the new ABI for that and inoutset are represented
like depobj - pointer to a pair of pointers where the first one
will be the actual address of the object mentioned in depend
clause and second pointer will be (void *) GOMP_DEPEND_INOUTSET.
2022-05-17 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_depend_kind): Add
OMP_CLAUSE_DEPEND_INOUTSET.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
* gimplify.cc (gimplify_omp_depend): Likewise.
* omp-low.cc (lower_depend_clauses): Likewise.
gcc/c-family/
* c-omp.cc (c_finish_omp_depobj): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
gcc/c/
* c-parser.cc (c_parser_omp_clause_depend): Parse
inoutset depend-kind.
(c_parser_omp_depobj): Likewise.
gcc/cp/
* parser.cc (cp_parser_omp_clause_depend): Parse
inoutset depend-kind.
(cp_parser_omp_depobj): Likewise.
* cxx-pretty-print.cc (cxx_pretty_printer::statement): Handle
OMP_CLAUSE_DEPEND_INOUTSET.
gcc/testsuite/
* c-c++-common/gomp/all-memory-1.c (boo): Add test with
inoutset depend-kind.
* c-c++-common/gomp/all-memory-2.c (boo): Likewise.
* c-c++-common/gomp/depobj-1.c (f1): Likewise.
(f2): Adjusted expected diagnostics.
* g++.dg/gomp/depobj-1.C (f4): Adjust expected diagnostics.
include/
* gomp-constants.h (GOMP_DEPEND_INOUTSET): Define.
libgomp/
* libgomp.h (struct gomp_task_depend_entry): Change is_in type
from bool to unsigned char.
* task.c (gomp_task_handle_depend): Handle GOMP_DEPEND_INOUTSET.
Ignore dependencies where
task->depend[i].is_in && task->depend[i].is_in == ent->is_in
rather than just task->depend[i].is_in && ent->is_in. Remember
whether GOMP_DEPEND_IN loop is needed and guard the loop with that
conditional.
(gomp_task_maybe_wait_for_dependencies): Handle GOMP_DEPEND_INOUTSET.
Ignore dependencies where elem.is_in && elem.is_in == ent->is_in
rather than just elem.is_in && ent->is_in.
* testsuite/libgomp.c-c++-common/depend-1.c (test): Add task with
inoutset depend-kind.
* testsuite/libgomp.c-c++-common/depend-2.c (test): Likewise.
* testsuite/libgomp.c-c++-common/depend-3.c (test): Likewise.
* testsuite/libgomp.c-c++-common/depend-inoutset-1.c: New test.
|