Age | Commit message | Author | Files | Lines |
|
|
|
(non-shared memory system)
'GCN_SUPPRESS_HOST_FALLBACK' originated as 'HSA_SUPPRESS_HOST_FALLBACK' in the
libgomp HSA plugin, where the idea was -- in my understanding -- that you
wouldn't have device code available for all functions that may be called, and
in that case transparently (shared memory system!) do host-fallback execution.
Or, with 'HSA_SUPPRESS_HOST_FALLBACK' set, you'd get those diagnosed.
This has then been copied into the libgomp GCN plugin as
'GCN_SUPPRESS_HOST_FALLBACK'. However, the original meaning isn't applicable
for the libgomp GCN plugin anymore: we assume that we're generating device code
for all relevant functions, and we're implementing a non-shared memory system,
where we cannot transparently do host-fallback execution for individual
functions.
However, 'GCN_SUPPRESS_HOST_FALLBACK' has gained an additional meaning, to
enforce a fatal error in case that 'libhsa-runtime64.so.1' can't be dynamically
loaded; keep that meaning.
libgomp/
* plugin/plugin-gcn.c (GOMP_OFFLOAD_can_run): Don't consider
'GCN_SUPPRESS_HOST_FALLBACK' anymore (assume always-'true').
(init_hsa_context): Adjust 'GCN_SUPPRESS_HOST_FALLBACK' error
message.
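The remaining meaning of the variable can be sketched as a small policy function. This is a minimal illustration, not the plugin's code: the enum, the function name, and the assumption that any set value (even an empty one) enables the strict behaviour are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical outcomes when the HSA runtime library cannot be loaded.  */
enum load_failure_action
{
  DISABLE_PLUGIN,   /* quietly skip the GCN plugin/device */
  FATAL_ERROR       /* stop the program with a diagnostic */
};

/* SUPPRESS_VALUE is the value of the environment variable, or NULL if it
   is unset.  Assumption for this sketch: any set value requests a fatal
   error.  */
enum load_failure_action
on_runtime_load_failure (const char *suppress_value)
{
  return suppress_value != NULL ? FATAL_ERROR : DISABLE_PLUGIN;
}
```

A caller would pass something like getenv ("GCN_SUPPRESS_HOST_FALLBACK") as the argument.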
|
|
Per commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp", we're now using 'return -1'
from 'GOMP_OFFLOAD_get_num_devices' for 'omp_requires_mask' purposes. This
missed that via 'nvptx_get_num_devices', we could also 'return -1' for
'cuDeviceGetCount' failure. Before, this meant (in 'gomp_target_init') to
silently ignore the plugin/device -- which also has been doubtful behavior.
Let's instead turn 'cuDeviceGetCount' failure into a fatal error, similar to
other errors during device initialization.
libgomp/
* plugin/plugin-nvptx.c (nvptx_get_num_devices):
'cuDeviceGetCount' failure is fatal.
|
|
'libcuda.so.1'
If 'libhsa-runtime64.so.1', 'libcuda.so.1' are not available, the corresponding
libgomp plugin/device gets disabled, as before. But if they are available,
report any inconsistencies such as missing symbols, similar to how we fail in
presence of other issues during device initialization.
libgomp/
* plugin/plugin-gcn.c (init_hsa_runtime_functions): Fatal error
for missing symbols.
* plugin/plugin-nvptx.c (init_cuda_lib): Likewise.
|
|
|
|
This reverts commit b14209715e659f6d3ca0f9eef9a4851e7bd6e373.
|
|
|
|
[PR114216]
For the type of the target callbacks we use elsewhere void (*) (void *) and
IMHO should use that for the reverse offload fallback as well (where the actual
callback is emitted using the same code as for host fallback or device kernel
entry routines), even though it is also ok to use void (*) () before C23 and
we aren't building libgomp with C23 yet. On some arches, void (*) () could
perhaps result in worse code generation, because calls through unprototyped
function pointers sometimes need to pass an argument in two different places
so that the call works whether the callee receives it through ... or as a
named argument.
2024-03-04 Jakub Jelinek <jakub@redhat.com>
PR libgomp/114216
* target.c (gomp_target_rev): Change host_fn type and corresponding
cast from void (*)() to void (*) (void *).
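The difference between the two pointer types can be illustrated with a small stand-alone sketch. The struct, the callback, and the dispatcher below are hypothetical and only mimic the idea of turning an integer address back into a prototyped callback; they are not taken from target.c.

```c
#include <stdint.h>

/* Prototyped callback type: exactly one 'void *' argument.  */
typedef void (*target_fn) (void *);

/* Hypothetical argument block passed through the single pointer.  */
struct add_args
{
  int a;
  int b;
  int sum;
};

/* Example callback that unpacks its argument block.  */
void
add_callback (void *data)
{
  struct add_args *args = data;
  args->sum = args->a + args->b;
}

/* Hypothetical dispatcher: the callback address arrives as an integer and
   is cast to the prototyped type, so the compiler knows the full signature
   at the call site.  */
int
run_add (int a, int b)
{
  struct add_args args = { a, b, 0 };
  uintptr_t addr = (uintptr_t) add_callback;
  target_fn fn = (target_fn) addr;
  fn (&args);
  return args.sum;
}
```

With an unprototyped void (*) () type, the call fn (&args) would still compile before C23, but the compiler would have to emit it without knowing the callee's parameter list.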
|
|
|
|
OpenMP permits '(first)private' for C++ member variables, which GCC handles
by tagging those by DECL_OMP_PRIVATIZED_MEMBER, adding a temporary VAR_DECL
and DECL_VALUE_EXPR pointing to the 'this->member_var' in the C++ front end.
The idea is that in omp-low.cc, the DECL_VALUE_EXPR is used before the
region (for 'firstprivate'; ignored for 'private') while in the region,
the DECL itself is used.
In gimplify, the value expansion is suppressed and deferred if the
lang_hooks.decls.omp_disregard_value_expr (decl, shared)
returns true - which is never the case if 'shared' is true. In OpenMP 4.5,
only 'map' and 'use_device_ptr' were permitted for the 'target' directive.
And when OpenMP 5.0's 'private'/'firstprivate' clauses were added, the
update that the 'shared' argument could now be false was missed. The
respective check has now been added.
2024-03-01 Jakub Jelinek <jakub@redhat.com>
Tobias Burnus <tburnus@baylibre.com>
PR c++/110347
gcc/ChangeLog:
* gimplify.cc (omp_notice_variable): Fix 'shared' arg to
lang_hooks.decls.omp_disregard_value_expr for
(first)private in target regions.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-lambda-3.C: Moved from
gcc/testsuite/g++.dg/gomp/ and fixed is-mapped handling.
* testsuite/libgomp.c++/target-lambda-1.C: Modify to also
work without offloading.
* testsuite/libgomp.c++/firstprivate-1.C: New test.
* testsuite/libgomp.c++/firstprivate-2.C: New test.
* testsuite/libgomp.c++/private-1.C: New test.
* testsuite/libgomp.c++/private-2.C: New test.
* testsuite/libgomp.c++/target-lambda-4.C: New test.
* testsuite/libgomp.c++/use_device_ptr-1.C: New test.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/target-lambda-1.C: Moved to become a
run-time test under testsuite/libgomp.c++.
Co-authored-by: Tobias Burnus <tburnus@baylibre.com>
|
|
|
|
acc_{alloc,free,hostptr,deviceptr,memcpy_{to,from}_device*}
These routines simply map to their C counterparts and are now
defined in OpenACC 3.3. (There are additional routine changes,
including the Fortran addition of acc_attach/acc_detach, that
require more work than a simple addition of an interface and
are therefore excluded.)
libgomp/ChangeLog:
* libgomp.texi (OpenACC Runtime Library Routines): Document new 3.3
routines that simply map to their C counterpart.
* openacc.f90 (openacc): Add them.
* openacc_lib.h: Likewise.
* testsuite/libgomp.oacc-fortran/acc_host_device_ptr.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-memcpy.f90: New test.
* testsuite/libgomp.oacc-fortran/acc-memcpy-2.f90: New test.
* testsuite/libgomp.oacc-c-c++-common/lib-59.c: Crossref to f90 test.
* testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
|
|
|
|
The main 'arch' context selector for nvptx is, well, 'nvptx';
however, as 'nvptx64' is used by LLVM, it makes sense
to support it as well.
Note that LLVM has: "The triple architecture can be one of
``nvptx`` (32-bit PTX) or ``nvptx64`` (64-bit PTX)."
GCC effectively only supports the 64bit variant (at least for
offloading). Thus, GCC's 'nvptx' is not quite the same as LLVM's.
The device-compiler part (nvptx_omp_device_kind_arch_isa) uses
TARGET_ABI64 such that nvptx64 is only defined with -m64.
gcc/ChangeLog:
* config/nvptx/gen-omp-device-properties.sh: Add 'nvptx64' to arch.
* config/nvptx/nvptx.cc (nvptx_omp_device_kind_arch_isa): Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP Context Selectors): Add 'nvptx64' as additional
'arch' value for nvptx.
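As a rough illustration of where the new 'arch' value would appear, the following sketch uses 'declare variant' with a device context selector. The function names are hypothetical, and the exact selector spelling accepted may depend on the compiler version.

```c
/* Variant intended for NVIDIA PTX offload devices (hypothetical name).  */
int
answer_nvptx (void)
{
  return 2;
}

/* With an OpenMP compiler that knows the 'nvptx64' arch value, calls in
   code compiled for such a device may be redirected to answer_nvptx.
   Elsewhere, including builds without OpenMP, the base function is used.  */
#pragma omp declare variant (answer_nvptx) match (device = {arch ("nvptx64")})
int
answer (void)
{
  return 1;
}
```

When answer () is called from ordinary host code, the selector does not match and the base version runs.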
|
|
|
|
Support for indirect calls to procedures/functions in offloaded target
regions is now available for C, C++ and Fortran.
2024-02-15 Kwok Cheung Yeung <kcyeung@baylibre.com>
libgomp/
* libgomp.texi (OpenMP 5.1): Mark indirect call support as fully
implemented.
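A minimal C sketch of what such an indirect call might look like follows. It assumes a compiler with OpenMP 5.1 'indirect' clause support; the function names are hypothetical. Without OpenMP, or without an available device, the same code simply runs on the host.

```c
/* Compile the function for the device as well, and request that it be
   callable through a function pointer inside target regions.  */
#pragma omp begin declare target indirect
int
add_one (int x)
{
  return x + 1;
}
#pragma omp end declare target

/* Hypothetical helper: calls FN inside a target region and returns its
   result to the host.  */
int
call_in_target (int (*fn) (int), int value)
{
  int result = -1;
#pragma omp target map(to: fn) map(from: result)
  result = fn (value);
  return result;
}
```

A call such as call_in_target (add_one, 41) passes the host address of add_one; translating that address to the device version of the function is left to the implementation.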
|
|
target directive
2024-02-15 Kwok Cheung Yeung <kcyeung@baylibre.com>
gcc/fortran/
* dump-parse-tree.cc (show_attr): Handle omp_declare_target_indirect
attribute.
* f95-lang.cc (gfc_gnu_attributes): Add entry for 'omp declare
target indirect'.
* gfortran.h (symbol_attribute): Add omp_declare_target_indirect
field.
(struct gfc_omp_clauses): Add indirect field.
* openmp.cc (omp_mask2): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_clauses): Match indirect clause.
(OMP_DECLARE_TARGET_CLAUSES): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_declare_target): Check omp_device_type and apply
omp_declare_target_indirect attribute to symbol if indirect clause
active. Show warning if there are only device_type and/or indirect
clauses on the directive.
* trans-decl.cc (add_attributes_to_decl): Add 'omp declare target
indirect' attribute if symbol has indirect attribute set.
gcc/testsuite/
* gfortran.dg/gomp/declare-target-4.f90 (f1): Update expected warning.
* gfortran.dg/gomp/declare-target-indirect-1.f90: New.
* gfortran.dg/gomp/declare-target-indirect-2.f90: New.
libgomp/
* testsuite/libgomp.fortran/declare-target-indirect-1.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-2.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-3.f90: New.
|
|
|
|
targets [PR113448]
Two libgomp tests XPASS on Solaris (any non-Linux target actually) since
their introduction:
XPASS: libgomp.c/alloc-pinned-1.c execution test
XPASS: libgomp.c/alloc-pinned-2.c execution test
The problem is that the test just prints
OS unsupported
and exits successfully, while the test is XFAILed:
/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */
Fixed by aborting immediately after the message above in the non-Linux
case.
Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.
2024-02-02 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
libgomp:
PR testsuite/113448
* testsuite/libgomp.c/alloc-pinned-1.c [!__linux__] (CHECK_SIZE):
Call abort.
* testsuite/libgomp.c/alloc-pinned-2.c [!__linux__] (CHECK_SIZE):
Likewise.
|
|
|
|
2024-02-11 John David Anglin <danglin@gcc.gnu.org>
libgomp/ChangeLog:
PR libgomp/113843
* configure.tgt (hppa*-*-linux*): Define config_path.
|
|
|
|
We support a maximum of 50 threads on 32-bit hppa.
2024-02-01 John David Anglin <danglin@gcc.gnu.org>
libgomp/ChangeLog:
* testsuite/libgomp.c++/loop-3.C: Set num_threads to 50
on 32-bit hppa.
* testsuite/libgomp.c/omp-loop03.c: Likewise.
|
|
|
|
libgomp/ChangeLog:
* testsuite/libgomp.c/declare-variant-4.h: Use gfx1100/gfx1030
function not gfx90a for gfx1100/gfx1030 context selector.
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
|
|
|
|
The following avoids registering unsupported GCN offload devices
when iterating over available ones. With a Zen4 desktop CPU
you will have an IGPU (unsupported) which will otherwise be made
available. This causes testcases like
libgomp.c-c++-common/non-rect-loop-1.c which iterate over all
devices to FAIL.
libgomp/
* plugin/plugin-gcn.c (suitable_hsa_agent_p): Filter out
agents with unsupported ISA.
|
|
The following makes the existing architecture support check work
instead of being optimized away (enum vs. -1). This avoids
later asserts when we assume such devices are never actually
used.
libgomp/
* plugin/plugin-gcn.c
(EF_AMDGPU_MACH::EF_AMDGPU_MACH_UNSUPPORTED): Add.
(isa_code): Return that instead of -1.
(GOMP_OFFLOAD_init_device): Adjust.
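One plausible reading of "enum vs. -1": if every enumerator is non-negative, the compiler may give the enum an unsigned underlying type, and a check such as 'code < 0' can then be folded to false. A named sentinel enumerator avoids that. The sketch below uses made-up names and values, not those from plugin-gcn.c.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative machine codes; names and values exist only for this
   sketch.  The negative enumerator forces a signed underlying type and
   gives the "unsupported" case a name.  */
typedef enum
{
  DEMO_MACH_UNSUPPORTED = -1,
  DEMO_MACH_GFX90A = 1,
  DEMO_MACH_GFX1030 = 2
} demo_mach;

/* Map an ISA name to a machine code, or to the sentinel if unknown.  */
demo_mach
demo_isa_code (const char *isa_name)
{
  if (strcmp (isa_name, "gfx90a") == 0)
    return DEMO_MACH_GFX90A;
  if (strcmp (isa_name, "gfx1030") == 0)
    return DEMO_MACH_GFX1030;
  return DEMO_MACH_UNSUPPORTED;
}

/* Compare against the sentinel by name instead of against a bare -1.  */
bool
demo_isa_supported (const char *isa_name)
{
  return demo_isa_code (isa_name) != DEMO_MACH_UNSUPPORTED;
}
```

A device filter can then skip any agent for which demo_isa_supported returns false.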
|
|
gcc/ChangeLog:
* config.gcc (amdgcn-*-*): Add gfx1030 and gfx1100 to
TM_MULTILIB_CONFIG.
* doc/install.texi (Configuration amdgcn-*-*): Mention gfx1030/gfx1100.
* doc/invoke.texi (AMD GCN Options): Add gfx1030 and gfx1100 to
-march/-mtune.
libgomp/ChangeLog:
* testsuite/libgomp.c/declare-variant-4.h: Add variant functions
for gfx1030 and gfx1100.
* testsuite/libgomp.c/declare-variant-4-gfx1030.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1100.c: New test.
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
|
|
This is enough to get gfx1030 and gfx1100 working; there are still some test
failures to investigate, and probably some tuning to do.
gcc/ChangeLog:
* config/gcn/gcn-opts.h (TARGET_PACKED_WORK_ITEMS): Add TARGET_RDNA3.
* config/gcn/gcn-valu.md (all_convert): New iterator.
(<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): New
define_expand, and rename the old one to ...
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): ... this.
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): Likewise, to ...
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): ... this.
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_shift<exec>): New.
* config/gcn/gcn.cc (gcn_global_address_p): Use "offsetbits" correctly.
(gcn_hsa_declare_function_name): Update the vgpr counting for gfx1100.
* config/gcn/gcn.md (<u>mulhisi3): Disable on RDNA3.
(<u>mulqihi3_scalar): Likewise.
libgcc/ChangeLog:
* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Handle RDNA3.
libgomp/ChangeLog:
* config/gcn/time.c (RTC_TICKS): Configure RDNA3.
(omp_get_wtime): Add RDNA3-compatible variant.
* plugin/plugin-gcn.c (max_isa_vgprs): Tune for gfx1030 and gfx1100.
Signed-off-by: Andrew Stubbs <ams@baylibre.com>
|
|
|
|
libgomp/ChangeLog:
* libgomp.texi (Runtime Library Routines): Document
omp_pause_resource, omp_pause_resource_all and
omp_target_memcpy{,_rect}{,_async}.
Co-authored-by: Sandra Loosemore <sandra@codesourcery.com>
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
|
|
|
|
Since r14-4734-g56ed1055b2f40ac162ae8d382280ac07a33f789f, GCC no longer
builds the Fiji (alias gfx803) libraries by default as support for it was
removed in ROCm 4.0 and will be removed in LLVM 18.
Thus, unless gfx803 is explicitly enabled, the following testcases will
fail to link as libgomp is not available for Fiji. Hence, this commit
xfails those testcases.
libgomp/ChangeLog:
* testsuite/libgomp.c/declare-variant-4-fiji.c: Xfail as fiji
support is no longer enabled by default.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: Likewise.
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
|
|
|
|
2024-01-20 John David Anglin <danglin@gcc.gnu.org>
libgomp/ChangeLog:
* testsuite/libgomp.fortran/alloc-comp-3.f90: Increase
timeout by 2 on hppa*-*-*.
|
|
hppa*-*-hpux* lacks necessary math functions.
2024-01-20 John David Anglin <danglin@gcc.gnu.org>
libgomp/ChangeLog:
* testsuite/libgomp.c/simd-math-1.c: Don't run on
hppa*-*-hpux*.
|
|
|
|
The following patch adds support for _BitInt iterators of OpenMP canonical
loops (with the preexisting limitation that when not using compile time
static scheduling the iterators in the library are at most unsigned long long
or signed long, so one can't in the runtime/dynamic/guided etc. cases iterate
more than what those types can represent, like is the case of e.g. __int128
iterators too) and the testcase also covers linear/reduction clauses for them.
2024-01-17 Jakub Jelinek <jakub@redhat.com>
PR middle-end/113409
* omp-general.cc (omp_adjust_for_condition): Handle BITINT_TYPE like
INTEGER_TYPE.
(omp_extract_for_data): Use build_bitint_type rather than
build_nonstandard_integer_type if either iter_type or loop->v type
is BITINT_TYPE.
* omp-expand.cc (expand_omp_for_generic,
expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): Handle
BITINT_TYPE like INTEGER_TYPE.
* testsuite/libgomp.c/bitint-1.c: New test.
|
|
|
|
This patch adds support for parsing general lvalues ("locator list item
types") for OpenMP "map", "to" and "from" clauses to the C front-end,
similar to the previously-posted patch for C++. Such syntax is permitted
for OpenMP 5.0 and above. It was previously posted for mainline here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609038.html
and for the og13 branch here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623355.html
2024-01-11 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-pretty-print.cc (c_pretty_printer::postfix_expression,
c_pretty_printer::expression): Add OMP_ARRAY_SECTION support.
gcc/c/
* c-parser.cc (c_parser_braced_init, c_parser_conditional_expression):
Don't allow OpenMP array section.
(c_parser_postfix_expression): Don't allow array section in statement
expression.
(c_parser_postfix_expression_after_primary): Add support for OpenMP
array section parsing.
(c_parser_expr_list): Don't allow OpenMP array section here.
(c_parser_omp_variable_list): Change ALLOW_DEREF parameter to
MAP_LVALUE. Support parsing of general lvalues in "map", "to" and
"from" clauses.
(c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to
MAP_LVALUE. Update call to c_parser_omp_variable_list.
(c_parser_oacc_data_clause): Update calls to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_reduction): Use OMP_ARRAY_SECTION tree node
instead of TREE_LIST for array sections.
(c_parser_omp_target): Allow GOMP_MAP_ATTACH.
* c-tree.h (c_omp_array_section_p): Add extern declaration.
(build_omp_array_section): Add prototype.
* c-typeck.cc (c_omp_array_section_p): Add flag.
(mark_exp_read): Support OMP_ARRAY_SECTION.
(build_omp_array_section): Add function.
(build_external_ref): Tweak error path for OpenMP array sections.
(handle_omp_array_sections_1): Use OMP_ARRAY_SECTION tree code instead
of TREE_LIST. Handle more kinds of expressions.
(c_oacc_check_attachments): Use OMP_ARRAY_SECTION instead of TREE_LIST
for array sections.
(c_finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST.
Check for supported expression types.
gcc/testsuite/
* gcc.dg/gomp/bad-array-section-c-1.c: New test.
* gcc.dg/gomp/bad-array-section-c-2.c: New test.
* gcc.dg/gomp/bad-array-section-c-3.c: New test.
* gcc.dg/gomp/bad-array-section-c-4.c: New test.
* gcc.dg/gomp/bad-array-section-c-5.c: New test.
* gcc.dg/gomp/bad-array-section-c-6.c: New test.
* gcc.dg/gomp/bad-array-section-c-7.c: New test.
* gcc.dg/gomp/bad-array-section-c-8.c: New test.
libgomp/
* libgomp.texi: C/C++ lvalues are supported now for map/to/from.
* testsuite/libgomp.c-c++-common/ind-base-4.c: New test.
* testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test.
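A small sketch of the kind of clause this enables: the list item in 'map' is a dereference expression rather than a plain variable name. The function name is hypothetical, and compilers without this support may reject the clause when OpenMP is enabled.

```c
/* Increment an int through a pointer inside a target region.  */
int
increment_through_pointer (void)
{
  int y = 0;
  int *yptr = &y;

  /* '*yptr' is a general lvalue, not a variable name or array section.  */
#pragma omp target map(tofrom: *yptr)
  {
    (*yptr)++;
  }

  return y;
}
```

Without OpenMP the pragma is ignored and the block runs on the host with the same visible effect on y.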
|
|
|
|
My earlier change broke Solaris testing, because @FLOCK@ isn't substituted
just into libgomp/Makefile where it worked, but also the
testsuite/libgomp-site-extra.exp file where Make variables aren't present
and can't be substituted.
The following patch instead computes the absolute srcdir path and uses it
for FLOCK.
2024-01-10 Jakub Jelinek <jakub@redhat.com>
PR libgomp/113192
* configure.ac (FLOCK): Use $libgomp_abs_srcdir/testsuite/flock
instead of \$(abs_top_srcdir)/testsuite/flock.
* configure: Regenerated.
|
|
|
|
This patch supports "lvalue" parsing (or "locator list item type" parsing)
for several OpenMP clause types for C++, as required for OpenMP 5.0
and above.
This version has been rebased -- some things have changed around
template handling recently, e.g. removal of build_non_dependent_expr and
tsubst_copy. A new potential corner-case issue has shown up regarding
implicit mapping of references to pointer to pointers -- an interaction
with the post-review fixes/rework for the patch here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638602.html
Which fixed the (new) tests baseptrs-[6789].C. I've noted that for now in
the patch, and adjusted the baseptrs-[46].C tests slightly to accommodate.
2024-01-08 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_address_inspector): Remove static from get_origin
and maybe_unconvert_ref methods.
* c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION.
(c_omp_address_inspector::map_supported_p): Handle OMP_ARRAY_SECTION.
(c_omp_address_inspector::get_origin): Avoid dereferencing possibly
NULL type when processing template decls.
(c_omp_address_inspector::maybe_unconvert_ref): Likewise.
gcc/cp/
* constexpr.cc (potential_constant_expression_1): Handle
OMP_ARRAY_SECTION.
* cp-tree.h (grok_omp_array_section, build_omp_array_section): Add
prototypes.
* decl2.cc (grok_omp_array_section): New function.
* error.cc (dump_expr): Handle OMP_ARRAY_SECTION.
* parser.cc (cp_parser_new): Initialize parser->omp_array_section_p.
(cp_parser_statement_expr): Disallow array sections.
(cp_parser_postfix_open_square_expression): Support OMP_ARRAY_SECTION
parsing.
(cp_parser_parenthesized_expression_list, cp_parser_lambda_expression,
cp_parser_braced_list): Disallow array sections.
(cp_parser_omp_var_list_no_open): Remove ALLOW_DEREF parameter, add
MAP_LVALUE in its place. Support generalised lvalue parsing for
OpenMP map, to and from clauses. Use OMP_ARRAY_SECTION
code instead of TREE_LIST to represent OpenMP array sections.
(cp_parser_omp_var_list): Remove ALLOW_DEREF parameter, add MAP_LVALUE.
Pass to cp_parser_omp_var_list_no_open.
(cp_parser_oacc_data_clause): Update call to cp_parser_omp_var_list.
(cp_parser_omp_clause_map): Add sk_omp scope around
cp_parser_omp_var_list_no_open call.
* parser.h (cp_parser): Add omp_array_section_p field.
* pt.cc (tsubst, tsubst_copy, tsubst_omp_clause_decl,
tsubst_copy_and_build): Add OMP_ARRAY_SECTION support.
* semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections,
cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION
instead of TREE_LIST where appropriate. Handle more types of map
expression.
* typeck.cc (build_omp_array_section): New function.
gcc/
* gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been
processed out before gimplification.
* tree-pretty-print.cc (dump_generic_node): Support OMP_ARRAY_SECTION.
* tree.def (OMP_ARRAY_SECTION): New tree code.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update expected output.
* c-c++-common/gomp/target-enter-data-1.c: Update scan test.
* g++.dg/gomp/array-section-1.C: New test.
* g++.dg/gomp/array-section-2.C: New test.
* g++.dg/gomp/bad-array-section-1.C: New test.
* g++.dg/gomp/bad-array-section-2.C: New test.
* g++.dg/gomp/bad-array-section-3.C: New test.
* g++.dg/gomp/bad-array-section-4.C: New test.
* g++.dg/gomp/bad-array-section-5.C: New test.
* g++.dg/gomp/bad-array-section-6.C: New test.
* g++.dg/gomp/bad-array-section-7.C: New test.
* g++.dg/gomp/bad-array-section-8.C: New test.
* g++.dg/gomp/bad-array-section-9.C: New test.
* g++.dg/gomp/bad-array-section-10.C: New test.
* g++.dg/gomp/bad-array-section-11.C: New test.
* g++.dg/gomp/has_device_addr-non-lvalue-1.C: New test.
* g++.dg/gomp/pr67522.C: Update expected output.
* g++.dg/gomp/ind-base-3.C: New test.
* g++.dg/gomp/map-assignment-1.C: New test.
* g++.dg/gomp/map-inc-1.C: New test.
* g++.dg/gomp/map-lvalue-ref-1.C: New test.
* g++.dg/gomp/map-ptrmem-1.C: New test.
* g++.dg/gomp/map-ptrmem-2.C: New test.
* g++.dg/gomp/map-static-cast-lvalue-1.C: New test.
* g++.dg/gomp/map-ternary-1.C: New test.
* g++.dg/gomp/member-array-2.C: New test.
libgomp/
* testsuite/libgomp.c++/baseptrs-4.C: Remove commented-out cases that
now work.
* testsuite/libgomp.c++/baseptrs-6.C: New test.
* testsuite/libgomp.c++/ind-base-1.C: New test.
* testsuite/libgomp.c++/ind-base-2.C: New test.
* testsuite/libgomp.c++/lvalue-tofrom-1.C: New test.
* testsuite/libgomp.c++/lvalue-tofrom-2.C: New test.
* testsuite/libgomp.c++/map-comma-1.C: New test.
* testsuite/libgomp.c++/map-rvalue-ref-1.C: New test.
* testsuite/libgomp.c++/struct-ref-1.C: New test.
* testsuite/libgomp.c-c++-common/array-field-1.c: New test.
* testsuite/libgomp.c-c++-common/array-of-struct-1.c: New test.
* testsuite/libgomp.c-c++-common/array-of-struct-2.c: New test.
|
|
When the flock program doesn't exist, libgomp configure attempts to
offer a fallback version using a perl script, but we weren't using
an absolute filename for it, so it apparently failed to work correctly.
The following patch arranges for it to get the absolute filename.
Tested by John David in the PR.
2024-01-09 Jakub Jelinek <jakub@redhat.com>
PR libgomp/113192
* configure.ac (FLOCK): Use \$(abs_top_srcdir)/testsuite/flock
rather than $srcdir/testsuite/flock.
* configure: Regenerated.
|
|
|
|
../../../source-gcc/libgomp/plugin/plugin-gcn.c: In function ‘isa_hsa_name’:
../../../source-gcc/libgomp/plugin/plugin-gcn.c:1666:10: error: ‘EF_AMDGPU_MACH_AMDGCN_GFX1100’ undeclared (first use in this function); did you mean ‘EF_AMDGPU_MACH_AMDGCN_GFX1030’?
1666 | case EF_AMDGPU_MACH_AMDGCN_GFX1100:
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| EF_AMDGPU_MACH_AMDGCN_GFX1030
../../../source-gcc/libgomp/plugin/plugin-gcn.c:1666:10: note: each undeclared identifier is reported only once for each function it appears in
../../../source-gcc/libgomp/plugin/plugin-gcn.c: In function ‘isa_code’:
../../../source-gcc/libgomp/plugin/plugin-gcn.c:1711:12: error: ‘EF_AMDGPU_MACH_AMDGCN_GFX1100’ undeclared (first use in this function); did you mean ‘EF_AMDGPU_MACH_AMDGCN_GFX1030’?
1711 | return EF_AMDGPU_MACH_AMDGCN_GFX1100;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| EF_AMDGPU_MACH_AMDGCN_GFX1030
../../../source-gcc/libgomp/plugin/plugin-gcn.c: In function ‘max_isa_vgprs’:
../../../source-gcc/libgomp/plugin/plugin-gcn.c:1728:10: error: ‘EF_AMDGPU_MACH_AMDGCN_GFX1100’ undeclared (first use in this function); did you mean ‘EF_AMDGPU_MACH_AMDGCN_GFX1030’?
1728 | case EF_AMDGPU_MACH_AMDGCN_GFX1100:
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| EF_AMDGPU_MACH_AMDGCN_GFX1030
make[4]: *** [Makefile:813: libgomp_plugin_gcn_la-plugin-gcn.lo] Error 1
Fix-up for commit 52a2c659ae6c21f84b6acce0afcb9b93b9dc71a0
"GCN: Add pre-initial support for gfx1100".
libgomp/
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
'EF_AMDGPU_MACH_AMDGCN_GFX1100'.
|
|
This patch adds support for 2D/3D memory copies for omp_target_memcpy_rect
using AMD extensions to the HSA API. This is just the AMD GCN-specific
part of the following patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631001.html
2024-01-04 Julian Brown <julian@codesourcery.com>
libgomp/
* plugin/plugin-gcn.c (hsa_runtime_fn_info): Add
hsa_amd_memory_lock_fn, hsa_amd_memory_unlock_fn,
hsa_amd_memory_async_copy_rect_fn function pointers.
(init_hsa_runtime_functions): Add above functions, with
DLSYM_OPT_FN.
(GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New functions.
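To make "2D memory copy" concrete, here is a host-only model of the operation: a rectangle of bytes is copied row by row between two buffers with different row pitches. The function and parameter names are hypothetical; the plugin itself would use the HSA runtime rather than memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Copy HEIGHT rows of WIDTH bytes each.  Consecutive rows start
   DST_PITCH bytes apart in DST and SRC_PITCH bytes apart in SRC.  */
void
copy_rect_2d (void *dst, size_t dst_pitch,
              const void *src, size_t src_pitch,
              size_t width, size_t height)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  for (size_t row = 0; row < height; row++)
    memcpy (d + row * dst_pitch, s + row * src_pitch, width);
}
```

A 3D copy adds one more loop over slices, with a slice pitch on each side.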
|
|
ROCm since 5.7.1 supports gfx1100 (RDNA3) cards. This commit adds support
for it, mostly by assuming gfx1100 behaves identical to gfx1030. Like gfx1030,
gfx1100 support is neither documented nor the build of the multilib enabled by
default.
But contrary to gfx1030, gfx1100 has a known issue causing some libraries not
to build, including newlib: The sdwa variant of v_mov_b32_sdwa is not supported
by the hardware, but GCC currently generates this instruction.
This will be addressed in a later commit.
gcc/ChangeLog:
* config.gcc (amdgcn-*-amdhsa): Accept --with-arch=gfx1100.
* config/gcn/gcn-hsa.h (NO_XNACK): Add gfx1100.
(ASM_SPEC): Handle gfx1100.
* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1100.
(enum gcn_isa): Add ISA_RDNA3.
(TARGET_GFX1100, TARGET_RDNA2_PLUS, TARGET_RDNA3): Define.
* config/gcn/gcn-valu.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS.
* config/gcn/gcn.cc (gcn_option_override,
gcn_omp_device_kind_arch_isa, output_file_start): Handle gfx1100.
(gcn_global_address_p, gcn_addr_space_legitimate_address_p): Change
TARGET_RDNA2 to TARGET_RDNA2_PLUS.
(gcn_hsa_declare_function_name): Don't use '.amdhsa_reserve_flat_scratch'
with gfx1100.
* config/gcn/gcn.h (ASSEMBLER_DIALECT): Likewise.
(TARGET_CPU_CPP_BUILTINS): Define __RDNA3__, __gfx1030__ and
__gfx1100__.
* config/gcn/gcn.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS.
* config/gcn/gcn.opt (Enum gpu_type): Add gfx1100.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1100): Define.
(isa_has_combined_avgprs, main): Handle gfx1100.
* config/gcn/t-omp-device (isa): Add gfx1100.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (gcn_gfx1100_s): New const string.
(gcn_isa_name_len): Fix length.
(isa_hsa_name, isa_code, max_isa_vgprs): Handle gfx1100.
|