Age | Commit message (Collapse) | Author | Files | Lines |
|
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_depobj): Fix to alloc/ptr dummy
and for c_ptr.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/depend-4.f90: Add VALUE test, update scan test.
* gfortran.dg/gomp/depend-5.f90: Fix scan tree for -m32.
* gfortran.dg/gomp/depend-6.f90: New test.
(cherry picked from commit 4d74ea551734694c225643c4069b1b4d4d2b05ed)
|
|
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_clauses, gfc_trans_omp_depobj):
Depend on the proper addr, for ptr/alloc depend on pointee.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/depend-4.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/depend-4.f90: New test.
* gfortran.dg/gomp/depend-5.f90: New test.
(cherry picked from commit 3939c1b11279dc950d2f160eb940dd791f7b40f1)
|
|
PR fortran/104329
gcc/fortran/ChangeLog:
* openmp.c (resolve_omp_atomic): Defer extra-code assert after
other diagnostics.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/atomic-28.f90: New test.
(cherry picked from commit 9694f6121982668285a21020b55b44c3099f7042)
|
|
Related to r12-6208-gebc853deb7cc0487de9ef6e891a007ba853d1933
"libgomp: Fix GOMP_DEVICE_NUM_VAR stringification during offload image load"
That commit fixed an issue with omp_get_device_num() on gcn/nvptx that
resulted in having always the value 0.
This commit modifies the tests to iterate over all devices such that on a
multi-nonhost-device system it had detected that always-zero issue.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/target-45.c: Iterate over all devices.
* testsuite/libgomp.fortran/target10.f90: Likewise.
(cherry picked from commit be661959a6b6d8f9c3c8608a746789e7b2ec3ca4)
|
|
gcc/fortran/ChangeLog:
PR fortran/103576
* openmp.c (is_scalar_intrinsic_expr): Fix condition.
(resolve_omp_atomic): Fix/update checks, accept compare.
* trans-openmp.c (gfc_trans_omp_atomic): Handle compare.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Set Fortran support for atomic to 'Y'.
* testsuite/libgomp.fortran/atomic-19.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/atomic-25.f90: Remove sorry, fix + add checks.
* gfortran.dg/gomp/atomic-26.f90: Likewise.
* gfortran.dg/gomp/atomic-21.f90: New test.
(cherry picked from commit 494ebfa7c9aacaeb6ec1fccc47a0e49f31eb2bb8)
|
|
Implements moste of OpenMP 5.1 atomic extensions,
except that 'compare' is parsed but rejected during
resolution. (As the trans-openmp.c handling is missing.)
gcc/fortran/ChangeLog:
* dump-parse-tree.c (show_omp_clauses): Handle
weak/compare/fail clause.
* gfortran.h (gfc_omp_clauses): Add weak, compare, fail.
* openmp.c (enum omp_mask1, gfc_match_omp_clauses,
OMP_ATOMIC_CLAUSES): Update for new clauses.
(gfc_match_omp_atomic): Update for 5.1 atomic changes.
(is_conversion): Support widening in one go.
(is_scalar_intrinsic_expr): New.
(resolve_omp_atomic): Update for 5.1 atomic changes.
* parse.c (parse_omp_oacc_atomic): Update for compare.
* resolve.c (gfc_resolve_blocks): Update asserts.
* trans-openmp.c (gfc_trans_omp_atomic): Handle new clauses.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/atomic-2.f90: Move now supported code to ...
* gfortran.dg/gomp/atomic.f90: here.
* gfortran.dg/gomp/atomic-10.f90: New test.
* gfortran.dg/gomp/atomic-12.f90: New test.
* gfortran.dg/gomp/atomic-15.f90: New test.
* gfortran.dg/gomp/atomic-16.f90: New test.
* gfortran.dg/gomp/atomic-17.f90: New test.
* gfortran.dg/gomp/atomic-18.f90: New test.
* gfortran.dg/gomp/atomic-19.f90: New test.
* gfortran.dg/gomp/atomic-20.f90: New test.
* gfortran.dg/gomp/atomic-22.f90: New test.
* gfortran.dg/gomp/atomic-24.f90: New test.
* gfortran.dg/gomp/atomic-25.f90: New test.
* gfortran.dg/gomp/atomic-26.f90: New test.
libgomp/ChangeLog
* libgomp.texi (OpenMP 5.1): Update status.
(cherry picked from commit 689407ef916503b2f5a3c8c07fe7d5ab1913f956)
|
|
gcc/fortran/ChangeLog:
* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
teams.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/thread-limit-1.f90: New test.
(cherry picked from commit 82ec4cb3c43c7429be6b902d96770a6435fa068b)
|
|
OpenMP 5.1 says that thread_limit clause can also appear on target,
and similarly to teams should affect the thread-limit-var ICV.
On combined target teams, the clause goes to both.
We actually passed thread_limit internally on target already before,
but only used it for gcn/ptx offloading to hint how many threads should be
created and for ptx didn't set thread_limit_var in that case.
Similarly for host fallback.
Also, I found that we weren't copying the args array that contains encoded
thread_limit and num_teams clause for target (etc.) for async target.
2021-11-15 Jakub Jelinek <jakub@redhat.com>
gcc/
* gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT
to OMP_TARGET_CLAUSES if it isn't there already.
gcc/c-family/
* c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>:
Duplicate to both OMP_TARGET and OMP_TEAMS.
gcc/c/
* c-parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
gcc/cp/
* parser.c (OMP_TARGET_CLAUSE_MASK): Add
PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
libgomp/
* task.c (gomp_create_target_task): Copy args array as well.
* target.c (gomp_target_fallback): Add args argument.
Set gomp_icv (true)->thread_limit_var if thread_limit is present.
(GOMP_target): Adjust gomp_target_fallback caller.
(GOMP_target_ext): Likewise.
(gomp_target_task_fn): Likewise.
* config/nvptx/team.c (gomp_nvptx_main): Set
gomp_global_icv.thread_limit_var.
* testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.
(cherry picked from commit aea72386831c0c5672f55983034cc709b968daea)
|
|
gcc/fortran/ChangeLog:
* parse.c (decode_omp_directive): Fix permitting 'nowait' for some
combined directives, add missing 'omp end ... loop'.
(gfc_ascii_statement): Fix ST_OMP_END_TEAMS_LOOP result.
* openmp.c (resolve_omp_clauses): Add missing combined loop constructs
case values to the 'if(directive-name: ...)' check.
* trans-openmp.c (gfc_split_omp_clauses): Put nowait on target if
first leaf construct accepting it.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/unexpected-end.f90: Update dg-error.
* gfortran.dg/gomp/clauses-1.f90: New test.
* gfortran.dg/gomp/nowait-2.f90: New test.
* gfortran.dg/gomp/nowait-3.f90: New test.
(cherry picked from commit 48c6cac9caea1dc7c5f50ad3a736f6693e74a11b)
|
|
Fortran part to commit r12-5146-g48d7327f2aaf65
gcc/fortran/ChangeLog:
* gfortran.h (struct gfc_omp_clauses): Rename num_teams to
num_teams_upper, add num_teams_upper.
* dump-parse-tree.c (show_omp_clauses): Update to handle
lower-bound num_teams clause.
* frontend-passes.c (gfc_code_walker): Likewise
* openmp.c (gfc_free_omp_clauses, gfc_match_omp_clauses,
resolve_omp_clauses): Likewise.
* trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses,
gfc_trans_omp_target): Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/teams-1.f90: New test.
(cherry picked from commit 407eaad25f45ccba6e45e6f07d6c69c51cc567f3)
|
|
In OpenMP 5.1, num_teams clause can accept either one expression as before,
but it in that case changed meaning, rather than create <= expression
teams it is now create == expression teams. Or it accepts two expressions
separated by :, with the meaning that the first is low bound and second upper
bound on how many teams should be created. The other ways to set number of
teams are upper bounds with lower bound of 1.
The following patch does parsing of this for C/C++. For host teams, we
actually don't need to do anything further right now, we always create
(pretend to create) exactly the requested number of teams, so we can just
evaluate and throw away the lower bound for now.
For teams nested in target, we don't guarantee that though and further
work will be needed.
In particular, omplower now turns the teams part of:
struct S { S (); S (const S &); ~S (); int s; };
void bar (S &, S &);
int baz ();
_Pragma ("omp declare target to (baz)");
void
foo (void)
{
S a, b;
#pragma omp target private (a) map (b)
{
#pragma omp teams firstprivate (b) num_teams (baz ())
{
bar (a, b);
}
}
}
into:
retval.0 = baz ();
retval.1 = retval.0;
{
unsigned int retval.3;
struct S * D.2549;
struct S b;
retval.3 = (unsigned int) retval.1;
D.2549 = .omp_data_i->b;
S::S (&b, D.2549);
#pragma omp teams num_teams(retval.1) firstprivate(b) shared(a)
__builtin_GOMP_teams (retval.3, 0);
{
bar (&a, &b);
}
S::~S (&b);
#pragma omp return(nowait)
}
IMHO we want a new API, say GOMP_teams3 which will take 3 arguments
instead of 2 (the lower and upper bounds from num_teams and thread_limit)
and will return a bool whether it should do the teams body or not.
And, we should add right before outermost {} above
while (__builtin_GOMP_teams3 ((unsigned) retval.1, (unsigned) retval.1, 0))
and remove the __builtin_GOMP_teams call. The current function performs
exit equivalent (at least on NVPTX) which seems bad because that means
the destructors of e.g. private variables on target aren't invoked, and
at the current placement neither destructors of the already constructed
privatized variables in teams.
I'll do this next on the compiler side, but I'm afraid I'll need help
with the nvptx and amdgcn implementations. E.g. for nvptx, we won't be
able to use %ctaid.x . I think ideal would be to use a .shared
integer variable for the omp_get_team_num value, but I don't have any
experience with that, are .shared variables zero initialized by default,
or do they have random value at start? PTX docs say they aren't initializable.
2021-11-11 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_NUM_TEAMS_EXPR): Rename to ...
(OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR): ... this.
(OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR): Define.
* tree.c (omp_clause_num_ops): Increase num ops for
OMP_CLAUSE_NUM_TEAMS to 2.
* tree-pretty-print.c (dump_omp_clause): Print optional lower bound
for OMP_CLAUSE_NUM_TEAMS.
* gimplify.c (gimplify_scan_omp_clauses): Gimplify
OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR if non-NULL.
(optimize_target_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead
of OMP_CLAUSE_NUM_TEAMS_EXPR. Handle OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
* omp-low.c (lower_omp_teams): Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR
instead of OMP_CLAUSE_NUM_TEAMS_EXPR.
* omp-expand.c (expand_teams_call, get_target_arguments): Likewise.
gcc/c/
* c-parser.c (c_parser_omp_clause_num_teams): Parse optional
lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
OMP_CLAUSE_NUM_TEAMS_EXPR.
(c_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
gcc/cp/
* parser.c (cp_parser_omp_clause_num_teams): Parse optional
lower-bound and store it into OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR.
Use OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of
OMP_CLAUSE_NUM_TEAMS_EXPR.
(cp_parser_omp_target): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
* semantics.c (finish_omp_clauses): Handle
OMP_CLAUSE_NUM_TEAMS_LOWER_EXPR of OMP_CLAUSE_NUM_TEAMS clause.
* pt.c (tsubst_omp_clauses): Likewise.
(tsubst_expr): For OMP_CLAUSE_NUM_TEAMS evaluate before
combined target teams even lower-bound expression.
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Use
OMP_CLAUSE_NUM_TEAMS_UPPER_EXPR instead of OMP_CLAUSE_NUM_TEAMS_EXPR.
gcc/testsuite/
* c-c++-common/gomp/clauses-1.c (bar): Supply lower-bound expression
to half of the num_teams clauses.
* c-c++-common/gomp/num-teams-1.c: New test.
* c-c++-common/gomp/num-teams-2.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Supply lower-bound expression
to half of the num_teams clauses.
* g++.dg/gomp/attrs-2.C (bar): Likewise.
* g++.dg/gomp/num-teams-1.C: New test.
* g++.dg/gomp/num-teams-2.C: New test.
libgomp/
* testsuite/libgomp.c-c++-common/teams-1.c: New test.
(cherry picked from commit 48d7327f2aaf65e224f5f0793a65b950297f6c7f)
|
|
The teams construct only permits omp_get_num_teams and omp_get_team_num
as API call in strictly nested regions - check for it.
Additionally, for Fortran, using DECL_NAME does not show the mangled
name, hence, DECL_ASSEMBLER_NAME had to be used to.
Finally, 'target device(ancestor:1)' wrongly rejected non-API calls
as well.
PR middle-end/102972
gcc/ChangeLog:
* omp-low.c (omp_runtime_api_call): Use DECL_ASSEMBLER_NAME to get
internal Fortran name; new permit_num_teams arg to permit
omp_get_num_teams and omp_get_team_num.
(scan_omp_1_stmt): Update call to it, add missing call for
reverse offload, and check for strictly nested API calls in teams.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-device-ancestor-3.c: Add non-API
routine test.
* gfortran.dg/gomp/order-6.f90: Add missing bind(C).
* c-c++-common/gomp/teams-3.c: New test.
* gfortran.dg/gomp/teams-3.f90: New test.
* gfortran.dg/gomp/teams-4.f90: New test.
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/icv-3.c: Nest API calls inside
parallel construct.
* testsuite/libgomp.c-c++-common/icv-4.c: Likewise.
* testsuite/libgomp.c/target-3.c: Likewise.
* testsuite/libgomp.c/target-5.c: Likewise.
* testsuite/libgomp.c/target-6.c: Likewise.
* testsuite/libgomp.c/target-teams-1.c: Likewise.
* testsuite/libgomp.c/teams-1.c: Likewise.
* testsuite/libgomp.c/thread-limit-2.c: Likewise.
* testsuite/libgomp.c/thread-limit-3.c: Likewise.
* testsuite/libgomp.c/thread-limit-4.c: Likewise.
* testsuite/libgomp.c/thread-limit-5.c: Likewise.
* testsuite/libgomp.fortran/icv-3.f90: Likewise.
* testsuite/libgomp.fortran/icv-4.f90: Likewise.
* testsuite/libgomp.fortran/teams1.f90: Likewise.
(cherry picked from commit 948d461954f2642ca187f86c19d297ba7a86320f)
|
|
In cases where a program constructs its own deep-copying for arrays-of-pointers,
e.g:
#pragma omp target enter data map(to:level->vectors[:N])
for (i = 0; i < N; i++)
#pragma omp target enter data map(to:level->vectors[i][:N])
We need to treat the part of the array reference before the array section
as a base-pointer (here 'level->vectors[i]'), providing pointer-attachment
behavior.
This patch adds this inside handle_omp_array_sections(), tracing the whole
sequence of array dimensions, creating a whole base-pointer reference
iteratively using build_array_ref(). The conditions are that each of the
"absorbed" dimensions must be length==1, and the final reference must be
of pointer-type (so that pointer attachment makes sense).
Merged from:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590658.html
2022-02-24 Chung-Lin Tang <cltang@codesourcery.com>
gcc/c/ChangeLog:
* c-typeck.cc (handle_omp_array_sections): Add handling for
creating array-reference base-pointer attachment clause.
gcc/cp/ChangeLog:
* semantics.cc (handle_omp_array_sections): Add handling for
creating array-reference base-pointer attachment clause.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Add case for
attach/detach map kind for ARRAY_REF of POINTER_TYPE.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/target-enter-data-1.c: Adjust testcase.
libgomp/testsuite/ChangeLog:
* libgomp.c-c++-common/ptr-attach-2.c: New test.
|
|
This patch fixes a misparsing issue when encountering code like:
#pragma omp metadirective when {<selector_set>={...}: A)
#pragma omp metadirective when (<selector_set>={...}: B)
When called for the first metadirective, analyze_metadirective_body would
stop just before the colon in the second metadirective because it naively
assumes that the '}' marks the end of a code block.
The assertion for clauses to end parsing at the same point is now disabled
if a parse error has occurred during the parsing of the clause, since some
tokens may not be consumed if a parse error cuts parsing short.
2022-02-18 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/c/
* c-parser.c (c_parser_omp_construct): Move handling of
PRAGMA_OMP_METADIRECTIVE from here...
(c_parser_pragma): ...to here.
(analyze_metadirective_body): Check that the bracket nesting level
is also zero before stopping the adding of tokens on encountering a
close brace.
(c_parser_omp_metadirective): Modify function signature and update.
Do not assert on remaining tokens if there has been a parse error.
gcc/cp/
* parser.c (cp_parser_omp_construct): Move handling of
PRAGMA_OMP_METADIRECTIVE from here...
(cp_parser_pragma): ...to here.
(analyze_metadirective_body): Check that the bracket
nesting level is also zero before stopping the adding of tokens on
encountering a close brace.
(cp_parser_omp_metadirective): Modify function signature and update.
Do not assert on remaining tokens if there has been a parse error.
gcc/testsuite/
* c-c++-common/gomp/metadirective-1.c (f): Add test for
improperly nested metadirectives.
|
|
This adds a check for declarative OpenMP directives in metadirective
variants (already present in the C/C++ front-ends), and fixes an
ICE when an empty metadirective (i.e. just '!$omp metadirective')
is presented.
2022-02-11 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/fortran/
* gfortran.h (is_omp_declarative_stmt): New.
* openmp.c (match_omp_metadirective): Reject declarative OpenMP
directives with 'sorry'.
* parse.c (parse_omp_metadirective_body): Check that state stack head
is non-null before dereferencing.
(is_omp_declarative_stmt): New.
gcc/testsuite/
* gfortran.dg/gomp/metadirective-2.f90 (main): Test empty
metadirective.
|
|
This patch checks during parsing if a metadirective selector is both
resolvable and non-matching - if so, it is removed from further
consideration. This is both more efficient, and avoids spurious
syntax errors caused by considering combinations of selectors that
lead to invalid combinations of OpenMP directives, when that
combination would never arise in the first place.
This exposes another bug - when metadirectives that are not of the
begin-end variety are nested, we might have to drill up through
multiple layers of the state stack to reach the state for the
next statement. This is now fixed.
2022-02-11 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* omp-general.c (DELAY_METADIRECTIVES_AFTER_LTO): Check that cfun is
non-null before derefencing.
gcc/fortran/
* decl.c (gfc_match_end): Search for first previous state that is not
COMP_OMP_METADIRECTIVE.
* gfortran.h (gfc_skip_omp_metadirective_clause): Add prototype.
* openmp.c (match_omp_metadirective): Skip clause if
result of gfc_skip_omp_metadirective_clause is true.
* trans-openmp.c (gfc_trans_omp_set_selector): Add argument and
disable expression conversion if false.
(gfc_skip_omp_metadirective_clause): New.
gcc/testsuite/
* gfortran.dg/gomp/metadirective-8.f90: New.
|
|
Obviously it would be better if these reductions could be evaluated at compile
time, but this will avoid an ICE.
gcc/ChangeLog:
* config/gcn/gcn.c (gcn_expand_reduc_scalar): Use force_reg.
(cherry picked from commit d51cad0b840a14c66732cb6a166c11ddf55d18b2)
|
|
An extra comma in an error message causes failures in the Fortran tests for
declare variant, because the message differs from that expected.
2022-01-31 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/fortran/
* openmp.c (gfc_match_omp_context_selector_specification): Remove
extra comma in error message.
|
|
gcc/
* ChangeLog.omp: Update.
gcc/testsuite
* ChangeLog.omp: Update.
libgomp/
* ChangeLog.omp: Update.
|
|
'construct={target}' called directly
void f(void)
{
#pragma omp metadirective \
when (construct={target}: A) \
default (B)
...
}
...
{
#pragma omp target
f(); // Target call
f(); // Local call
}
With the OpenMP 5.0/5.1 specifications, we would expect A to be selected in
the metadirective when the target call is made, but B when f is called
directly outside of a target context. However, since GCC does not have
separate copies of f for local and target calls, and the construct selector
is static, it must be resolved one way or the other at compile-time (currently
in the favour of selecting A), which may be unexpected behaviour.
This patch attempts to detect the above situation, and will emit a warning
if found.
2022-01-28 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* gimplify.c (gimplify_omp_metadirective): Mark offloadable functions
containing metadirectives with 'construct={target}' in the selector.
* omp-general.c (omp_has_target_constructor_p): New.
* omp-general.h (omp_has_target_constructor_p): New prototype.
* omp-low.c (lower_omp_1): Emit warning if marked functions called
outside of a target context.
gcc/testsuite/
* c-c++-common/gomp/metadirective-4.c (main): Add expected warning.
* gfortran.dg/gomp/metadirective-4.f90 (test): Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/metadirective-2.c (main): Add
expected warning.
* testsuite/libgomp.fortran/metadirective-2.f90 (test): Likewise.
|
|
This issue was triggered after the patch extending syntax for component access
in map clauses in commit 0ab29cf0bb68960c1f87405f14b4fb2109254e2f.
In gimplify_scan_omp_clauses, the case for handling indirect accesses (which
creates firstprivate ptr and zero-length array section map for such decls) was
erroneously went into for non-pointer cases (here being the base struct decl),
so added the
appropriate checks there.
Added new testcase is a compile only test for the ICE. The original omptests
t-partial-struct test actually should not execute correctly, because for
map(t.s->a[:N]), map(t.s[:1]) is not implicitly mapped, thus the entire
offloaded access does not work as is (fixing that omptests test is out of
scope here).
2022-01-27 Chung-Lin Tang <cltang@codesourcery.com>
PR middle-end/103642
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Do not do indir_p handling
for non-pointer or non-reference-to-pointer cases.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pr103642.c: New test.
(cherry picked from commit 1c91b014923f418e0aab789c5cf57facf04bf266)
|
|
Fix ICE for cases like:
#pragma omp target update from(s[0].a[0:1])
where multiple ARRAY_REF nodes exist and require more than one peeling
during [c_]finish_omp_clauses.
PR c++/103705
gcc/c/ChangeLog:
* c-typeck.c (c_finish_omp_clauses): Also continue peeling off of
outer node for ARRAY_REFs.
gcc/cp/ChangeLog:
* semantics.c (finish_omp_clauses): Also continue peeling off of
outer node for ARRAY_REFs.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pr103705.c: New test.
(cherry picked from commit cd7484d05cd4b7a9d741fe8bf6c4525406ed7620)
|
|
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New
type.
* omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New builtin.
* omp-general.c (omp_context_selector_matches): Handle 'target_device'
selector set.
(omp_dynamic_cond): Generate expression tree for 'target_device'
selector set.
(omp_context_compute_score): Handle selectors in 'target_device' set.
gcc/c/
* c-parser.c (omp_target_device_selectors): New.
(c_parser_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(c_parser_omp_context_selector_specification): Handle 'target_device'
selector set.
gcc/cp/
* parser.c (omp_target_device_selectors): New.
(cp_parser_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(cp_parser_omp_context_selector_specification): Handle 'target_device'
selector set.
gcc/fortran/
* openmp.c (omp_target_device_selectors): New.
(gfc_match_omp_context_selector): Accept 'target_device' selector set.
Treat 'device_num' selector as expression.
(gfc_match_omp_context_selector_specification): Handle 'target_device'
selector set.
* types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New type.
gcc/testsuite/
* c-c++-common/gomp/metadirective-7.c: New.
* gfortran.dg/gomp/metadirective-7.f90: New.
libgomp/
* Makefile.am (libgomp_la_SOURCES): Add selector.c.
* Makefile.am: Regenerate.
* config/gcn/selector.c: New.
* config/linux/selector.c: New.
* config/linux/x86/selector.c: New.
* config/nvptx/selector.c: New.
* libgomp-plugin.h (GOMP_OFFLOAD_evaluate_device): New.
* libgomp.h (struct gomp_device_descr): Add evaluate_device_func field.
* libgomp.map (GOMP_5.1): Add GOMP_evaluate_target_device.
* libgomp_g.h (GOMP_evaluate_current_device): New.
(GOMP_evaluate_target_device): New.
* oacc-host.c (host_evaluate_device): New.
(host_openacc_exec): Initialize evaluate_device_func field to
host_evaluate_device.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_evaluate_device): New.
* plugin/plugin-nvptx.c (struct ptx_device): Add compute_major and
compute_minor fields.
(nvptx_open_device): Read compute capability information from device.
(CHECK_ISA): New macro.
(GOMP_OFFLOAD_evaluate_device): New.
* selector.c: New.
* target.c (GOMP_evaluate_target_device): New.
(gomp_load_plugin_for_device): Load evaulate_device plugin function.
* testsuite/libgomp.c-c++-common/metadirective-5.c: New testcase.
* testsuite/libgomp.fortran/metadirective-5.f90: New testcase.
|
|
Fix regressions introduced by block/statement skipping.
If user condition selector is constant, do not return it as a dynamic
selector.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/c/
* c-parser.c (c_parser_skip_to_end_of_block_or_statement): Track
bracket depth separately from nesting depth.
gcc/cp/
* parser.c (cp_parser_skip_to_end_of_statement): Revert.
(cp_parser_skip_to_end_of_block_or_statement): Track bracket depth
separately from nesting depth.
gcc/
* omp-general.c (omp_dynamic_cond): Do not return user condition if
constant.
|
|
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/testsuite/
* c-c++-common/gomp/metadirective-1.c: New.
* c-c++-common/gomp/metadirective-2.c: New.
* c-c++-common/gomp/metadirective-3.c: New.
* c-c++-common/gomp/metadirective-4.c: New.
* c-c++-common/gomp/metadirective-5.c: New.
* c-c++-common/gomp/metadirective-6.c: New.
* gcc.dg/gomp/metadirective-1.c: New.
* gfortran.dg/gomp/metadirective-1.f90: New.
* gfortran.dg/gomp/metadirective-2.f90: New.
* gfortran.dg/gomp/metadirective-3.f90: New.
* gfortran.dg/gomp/metadirective-4.f90: New.
* gfortran.dg/gomp/metadirective-5.f90: New.
* gfortran.dg/gomp/metadirective-6.f90: New.
libgomp/
* testsuite/libgomp.c-c++-common/metadirective-1.c: New.
* testsuite/libgomp.c-c++-common/metadirective-2.c: New.
* testsuite/libgomp.c-c++-common/metadirective-3.c: New.
* testsuite/libgomp.c-c++-common/metadirective-4.c: New.
* testsuite/libgomp.fortran/metadirective-1.f90: New.
* testsuite/libgomp.fortran/metadirective-2.f90: New.
* testsuite/libgomp.fortran/metadirective-3.f90: New.
* testsuite/libgomp.fortran/metadirective-4.f90: New.
|
|
This adds support for parsing OpenMP metadirectives in the Fortran front end.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* omp-general.c (omp_check_context_selector): Revert string length
check.
(omp_context_name_list_prop): Likewise.
gcc/fortran/
* decl.c (gfc_match_end): Handle COMP_OMP_METADIRECTIVE and
COMP_OMP_BEGIN_METADIRECTIVE.
* dump-parse-tree.c (show_omp_node): Handle EXEC_OMP_METADIRECTIVE.
(show_code_node): Handle EXEC_OMP_METADIRECTIVE.
* gfortran.h (enum gfc_statement): Add ST_OMP_METADIRECTIVE,
ST_OMP_BEGIN_METADIRECTIVE and ST_OMP_END_METADIRECTIVE.
(struct gfc_omp_metadirective_clause): New structure.
(gfc_get_omp_metadirective_clause): New macro.
(struct gfc_st_label): Add omp_region field.
(enum gfc_exec_op): Add EXEC_OMP_METADIRECTIVE.
(struct gfc_code): Add omp_metadirective_clauses field.
(gfc_free_omp_metadirective_clauses): New prototype.
(match_omp_directive): New prototype.
* io.c (format_asterisk): Initialize omp_region field.
* match.h (gfc_match_omp_begin_metadirective): New prototype.
(gfc_match_omp_metadirective): New prototype.
* openmp.c (gfc_match_omp_eos): Match ')' in context selectors.
(gfc_free_omp_metadirective_clauses): New.
(gfc_match_omp_clauses): Remove context_selector argument. Rely on
gfc_match_omp_eos to match end of clauses.
(match_omp): Remove extra argument to gfc_match_omp_clauses.
(gfc_match_omp_context_selector): Remove extra argument to
gfc_match_omp_clauses. Set gfc_matching_omp_context_selector
before call to gfc_match_omp_clauses and reset after.
(gfc_match_omp_context_selector_specification): Modify to take a
gfc_omp_set_selector** argument.
(gfc_match_omp_declare_variant): Pass set_selectors to
gfc_match_omp_context_selector_specification.
(match_omp_metadirective): New.
(gfc_match_omp_begin_metadirective): New.
(gfc_match_omp_metadirective): New.
(resolve_omp_metadirective): New.
(gfc_resolve_omp_directive): Handle EXEC_OMP_METADIRECTIVE.
* parse.c (gfc_matching_omp_context_selector): New variable.
(gfc_in_metadirective_body): New variable.
(gfc_omp_region_count): New variable.
(decode_omp_directive): Match 'begin metadirective',
'end metadirective' and 'metadirective'.
(match_omp_directive): New.
(case_omp_structured_block): New.
(case_omp_do): New.
(gfc_ascii_statement): Handle metadirective statements.
(gfc_omp_end_stmt): New.
(parse_omp_do): Delegate to gfc_omp_end_stmt.
(parse_omp_structured_block): Delegate to gfc_omp_end_stmt. Handle
ST_OMP_END_METADIRECTIVE.
(parse_omp_metadirective_body): New.
(parse_executable): Delegate to case_omp_structured_block and
case_omp_do. Return after one statement if compiling regular
metadirective. Handle metadirective statements.
(gfc_parse_file): Reset gfc_omp_region_count,
gfc_in_metadirective_body and gfc_matching_omp_context_selector.
* parse.h (enum gfc_compile_state): Add COMP_OMP_METADIRECTIVE and
COMP_OMP_BEGIN_METADIRECTIVE.
(gfc_omp_end_stmt): New prototype.
(gfc_matching_omp_context_selector): New declaration.
(gfc_in_metadirective_body): New declaration.
(gfc_omp_region_count): New declaration.
* resolve.c (gfc_resolve_code): Handle EXEC_OMP_METADIRECTIVE.
* st.c (gfc_free_statement): Handle EXEC_OMP_METADIRECTIVE.
* symbol.c (compare_st_labels): Take omp_region into account.
(gfc_get_st_labels): Incorporate omp_region into label.
* trans-decl.c (gfc_get_label_decl): Add omp_region into translated
label name.
* trans-openmp.c (gfc_trans_omp_directive): Handle
EXEC_OMP_METADIRECTIVE.
(gfc_trans_omp_set_selector): Hoist code from...
(gfc_trans_omp_declare_variant): ...here.
(gfc_trans_omp_metadirective): New.
* trans-stmt.h (gfc_trans_omp_metadirective): New prototype.
* trans.c (trans_code): Handle EXEC_OMP_METADIRECTIVE.
|
|
This adds support for parsing OpenMP metadirectives in the C++ front end.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/cp/
* parser.c (cp_parser_skip_to_end_of_statement): Handle parentheses.
(cp_parser_skip_to_end_of_block_or_statement): Likewise.
(cp_parser_omp_context_selector): Add extra argument. Allow
non-constant expressions.
(cp_parser_omp_context_selector_specification): Add extra argument and
propagate to cp_parser_omp_context_selector.
(analyze_metadirective_body): New.
(cp_parser_omp_metadirective): New.
(cp_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE.
(cp_parser_pragma): Handle PRAGMA_OMP_METADIRECTIVE.
|
|
This patch adds support for streaming metadirective Gimple statements during
LTO, and adds a metadirective expansion pass that runs after LTO. This is
required for metadirectives with selectors that can only be resolved from
within the accel compiler.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* Makefile.in (OBJS): Add omp-expand-metadirective.o.
* gimple-streamer-in.c (input_gimple_stmt): Add case for
GIMPLE_OMP_METADIRECTIVE. Handle metadirective labels.
* gimple-streamer-out.c (output_gimple_stmt): Likewise.
* omp-expand-metadirective.cc: New.
* passes.def: Add pass_omp_expand_metadirective.
* tree-pass.h (make_pass_omp_expand_metadirective): New prototype.
|
|
Gimplification
This adds support for resolving metadirectives according to the OpenMP 5.1
specification. The variants are sorted by score, then gathered into a list
of dynamic replacement candidates. The metadirective is then expanded into
a sequence of 'if..else' statements to test the dynamic selector and execute
the variant if the selector is satisfied.
If any of the selectors in the list are unresolvable, GCC will give up on
resolving the metadirective and try again later.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* gimplify.c (expand_omp_metadirective): New.
* omp-general.c: Include tree-pretty-print.h.
(DELAY_METADIRECTIVES_AFTER_LTO): New macro.
(omp_context_selector_matches): Delay resolution of selectors. Allow
non-constant expressions.
(omp_dynamic_cond): New.
(omp_dynamic_selector_p): New.
(sort_variant): New.
(omp_get_dynamic_candidates): New.
(omp_resolve_metadirective): New.
(omp_resolve_metadirective): New.
* omp-general.h (struct omp_metadirective_variant): New.
(omp_resolve_metadirective): New prototype.
gcc/c-family/
* c-omp.c (c_omp_expand_metadirective_r): New.
(c_omp_expand_metadirective): New.
|
|
This adds a new Gimple statement type GIMPLE_OMP_METADIRECTIVE, which
represents the metadirective in Gimple. In high Gimple, the statement
contains the body of the directive variants, whereas in low Gimple, it
only contains labels to the bodies.
This patch adds support for converting metadirectives from tree to Gimple
form, and handling of the Gimple form (Gimple lowering, OpenMP lowering
and expansion, inlining, SSA handling etc).
Metadirectives should be resolved before they reach the back-end, otherwise
the compiler will crash as GCC does not know how to convert metadirective
Gimple statements to RTX.
2022-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* gimple-low.c (lower_omp_metadirective): New.
(lower_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
* gimple-pretty-print.c (dump_gimple_omp_metadirective): New.
(pp_gimple_stmt_1): Handle GIMPLE_OMP_METADIRECTIVE.
* gimple-walk.c (walk_gimple_op): Handle GIMPLE_OMP_METADIRECTIVE.
(walk_gimple_stmt): Likewise.
* gimple.c (gimple_alloc_omp_metadirective): New.
(gimple_build_omp_metadirective): New.
(gimple_build_omp_metadirective_variant): New.
* gimple.def (GIMPLE_OMP_METADIRECTIVE): New.
(GIMPLE_OMP_METADIRECTIVE_VARIANT): New.
* gimple.h (gomp_metadirective_variant): New.
(gomp_metadirective): New.
(is_a_helper <gomp_metadirective *>::test): New.
(is_a_helper <gomp_metadirective_variant *>::test): New.
(is_a_helper <const gomp_metadirective *>::test): New.
(is_a_helper <const gomp_metadirective_variant *>::test): New.
(gimple_alloc_omp_metadirective): New prototype.
(gimple_build_omp_metadirective): New prototype.
(gimple_build_omp_metadirective_variant): New prototype.
(gimple_has_substatements): Add GIMPLE_OMP_METADIRECTIVE case.
(gimple_has_ops): Add GIMPLE_OMP_METADIRECTIVE.
(gimple_omp_metadirective_label): New.
(gimple_omp_metadirective_set_label): New.
(gimple_omp_metadirective_variants): New.
(gimple_omp_metadirective_set_variants): New.
(CASE_GIMPLE_OMP): Add GIMPLE_OMP_METADIRECTIVE.
* gimplify.c (is_gimple_stmt): Add OMP_METADIRECTIVE.
(expand_omp_metadirective): New.
(gimplify_omp_metadirective): New.
(gimplify_expr): Add case for OMP_METADIRECTIVE.
* gsstruct.def (GSS_OMP_METADIRECTIVE): New.
(GSS_OMP_METADIRECTIVE_VARIANT): New.
* omp-expand.c (build_omp_regions_1): Handle GIMPLE_OMP_METADIRECTIVE.
(omp_make_gimple_edges): Likewise.
* omp-low.c (struct omp_context): Add next_clone field.
(new_omp_context): Initialize next_clone field.
(clone_omp_context): New.
(delete_omp_context): Delete clone contexts.
(scan_omp_metadirective): New.
(scan_omp_1_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
(lower_omp_metadirective): New.
(lower_omp_1): Handle GIMPLE_OMP_METADIRECTIVE.
* tree-cfg.c (cleanup_dead_labels): Handle GIMPLE_OMP_METADIRECTIVE.
(gimple_redirect_edge_and_branch): Likewise.
* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_METADIRECTIVE.
(estimate_num_insns): Likewise.
* tree-pretty-print.c (dump_generic_node): Handle OMP_METADIRECTIVE.
* tree-ssa-operands.c (parse_ssa_operands): Handle
GIMPLE_OMP_METADIRECTIVE.
|
|
This patch implements parsing for the OpenMP metadirective introduced in
OpenMP 5.0. Metadirectives are parsed into an OMP_METADIRECTIVE node,
with the variant clauses forming a chain accessible via
OMP_METADIRECTIVE_CLAUSES. Each clause contains the context selector
and tree for the variant.
User conditions in the selector are now permitted to be non-constant when
used in metadirectives as specified in OpenMP 5.1.
2021-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* omp-general.c (omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New stub function.
* omp-general.h (struct omp_metadirective_variant): New.
(omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New prototype.
* tree.def (OMP_METADIRECTIVE): New.
* tree.h (OMP_METADIRECTIVE_CLAUSES): New macro.
gcc/c/
* c-parser.c (c_parser_skip_to_end_of_block_or_statement): Handle
parentheses in statement.
(c_parser_omp_metadirective): New prototype.
(c_parser_omp_context_selector): Add extra argument. Allow
non-constant expressions.
(c_parser_omp_context_selector_specification): Add extra argument and
propagate it to c_parser_omp_context_selector.
(analyze_metadirective_body): New.
(c_parser_omp_metadirective): New.
(c_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE.
gcc/c-family
* c-common.h (enum c_omp_directive_kind): Add C_OMP_DIR_META.
(c_omp_expand_metadirective): New prototype.
* c-gimplify.c (genericize_omp_metadirective_stmt): New.
(c_genericize_control_stmt): Handle OMP_METADIRECTIVE tree nodes.
* c-omp.c (omp_directives): Classify metadirectives as C_OMP_DIR_META.
(c_omp_expand_metadirective): New stub function.
* c-pragma.c (omp_pragmas): Add entry for metadirective.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_METADIRECTIVE.
|
|
This makes it possible to apply GCCs stable sort algorithm to vec<>
and also use it with the qsort_r compatible interface.
2021-06-10 Richard Biener <rguenther@suse.de>
* system.h (gcc_stablesort_r): Declare.
* sort.cc (gcc_sort_r): Support stable sort.
(gcc_stablesort_r): Define.
* vec.h (vec<>::stablesort): Add.
(cherry-picked from commit 367f52dcc24045b072aeb26bc301a2980b39241f)
|
|
gfc_finish_var_decl was confused by the undocumented overloading of
the proc_name field in struct gfc_namespace to contain iterator
variables for the OpenMP AFFINITY clause, causing it to insert the
decls in the wrong scope. This patch adds a new distinct field to
hold these variables.
2022-01-23 Sandra Loosemore <sandra@codesourcery.com>
PR fortran/103695
PR fortran/102621
Backport from mainline:
2022-01-20 Sandra Loosemore <sandra@codesourcery.com>
gcc/fortran
* gfortran.h (struct gfc_namespace) Add omp_affinity_iterator
field.
* dump-parse-tree.c (show_iterator): Use it.
* openmp.c (gfc_match_iterator): Likewise.
(resolve_omp_clauses): Likewise.
* trans-decl.c (gfc_finish_var_decl): Likewise.
* trans-openmp.c (handle_iterator): Likewise.
gcc/testsuite/
* gfortran.dg/gomp/affinity-clause-3.f90: Adjust pattern.
* gfortran.dg/gomp/pr102621.f90: New.
* gfortran.dg/gomp/pr103695.f90: New.
|
|
Currently omp_get_device_num does not work on gcn targets with more than one
offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in
icv-device.c and thus "__gomp_device_num" is not visible in the offload image.
This patch removes "static" such that "__gomp_device_num" is now part of the
offload image and can now be found in GOMP_OFFLOAD_load_image in the plugin.
This is not an issue for nvptx. There, "__gomp_device_num" is in the offload
image even with "static".
libgomp/ChangeLog:
* config/gcn/icv-device.c: Make GOMP_DEVICE_NUM_VAR public (remove
"static") to make the device num available in the offload image.
(cherry picked from commit 0bd247bbbe4cf396173f09eeec37e116e98f8471)
|
|
In the patch that implemented omp_get_device_num(), there was an error where
the stringification of GOMP_DEVICE_NUM_VAR, which is the macro expanding to
the actual symbol used, was erroneously using the STRINGX() macro in the
libgomp offload image symbol search, and expansion of the variable name
string through the additional layer of preprocessor symbol was not properly
achieved.
This patch fixes this by changing to properly use XSTRING(), also from
include/symcat.h.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Change uses of STRINGX
into XSTRING when looking for GOMP_DEVICE_NUM_VAR in offload image.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
(cherry picked from commit fbb592407c9dd244b4cea086cbb90d7bd0bf60bb)
|
|
After the PR90030 patch, which removes the universal casting of all Fortran
array pointers to 'c_char*', a Fortran descriptor based array passed into an
affinity() clause now looks like:
- #pragma omp task private(i) shared(b) affinity(*(c_char *) a.data)
+ #pragma omp task private(i) shared(b) affinity(*(integer(kind=4)[0:] * restrict) a.data)
The 'integer(kind=4)[0:]' incomplete type appears to be causing ICE during
gimplify_expr() due to 'is_gimple_val, fb_rvalue'. The ICE appears to be fixed
just by adjusting to 'is_gimple_lvalue, fb_lvalue'. Considering the use of the
affinity() clause, which should be specifying the location of a particular
object in memory, this probably makes sense.
gcc/ChangeLog:
PR middle-end/103643
* gimplify.c (gimplify_omp_affinity): Adjust gimplify_expr of entire
OMP_CLAUSE_DECL to use 'is_gimple_lvalue, fb_lvalue'
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/pr103643.f90: New test.
(cherry picked from commit 62c8b21d48ab6012ddc50529a39071d902dba31a)
|
|
Up to now the libgomp GCN plugin has been finding the offload variables
by using a symbol lookup, but the AMD runtime requires that the symbols are
global for that to work. This was ensured by mkoffload as a post-procssing
step, but the LLVM 13 assembler no longer accepts this in the case where the
variable was previously declared differently.
This patch switches to locating the symbols directly from the
offload_var_table, which means that only one symbol needs to be forced
global.
This changes breaks the libgomp image compatibility so GOMP_VERSION_GCN has
also been bumped.
gcc/ChangeLog:
* config/gcn/mkoffload.c (process_asm): Process the variable table
completely differently.
(process_obj): Encode the varaible data differently.
include/ChangeLog:
* gomp-constants.h (GOMP_VERSION_GCN): Bump.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (struct gcn_image_desc): Remove global_variables.
(GOMP_OFFLOAD_load_image): Locate the offload variables via the
table, not individual symbols.
(cherry picked from commit 4a87a8e4b13e979e7c8a626a8f4082715a48e21e)
|
|
This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm,
is thread safe and the size of the low-latency heap can be configured using
the GOMP_NVPTX_LOWLAT_POOL environment variable.
The use of the PTX dynamic_smem_size feature means that the minimum version
requirement is now bumped to 4.1 (still old at this point).
libgomp/ChangeLog:
* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(dynamic_smem_size): New constants.
(omp_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC.
Implement fall-backs for predefined allocators.
(omp_realloc): Use MEMSPACE_REALLOC.
Implement fall-backs for predefined allocators.
* config/nvptx/team.c (__nvptx_lowlat_heap_root): New variable.
(__nvptx_lowlat_pool): New asm varaible.
(gomp_nvptx_main): Initialize the low-latency heap.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/allocators-1.c: New test.
* testsuite/libgomp.c/allocators-2.c: New test.
* testsuite/libgomp.c/allocators-3.c: New test.
* testsuite/libgomp.c/allocators-4.c: New test.
* testsuite/libgomp.c/allocators-5.c: New test.
* testsuite/libgomp.c/allocators-6.c: New test.
|
|
gcc/ChangeLog:
* config/nvptx/nvptx-opts.h (ptx_version): Change PTX_VERSION_3_1 to
PTX_VERSION_4_1.
* config/nvptx/nvptx.c (nvptx_file_start): Bump minimum PTX version
to 4.1.
* config/nvptx/nvptx.opt (ptx_version): Add 4.1. Change default.
doc/invoke.texi: -mptx default is now 4.1.
|
|
There's no need to reject the dynamic_allocators requires directive because
we actually do support the feature, and it doesn't have to actually "do"
anything.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_requires): Don't "sorry" dynamic_allocators.
gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_requires): Don't "sorry" dynamic_allocators.
gcc/fortran/ChangeLog:
* openmp.c (gfc_match_omp_requires): Don't "sorry" dynamic_allocators.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/requires-8.f90: Reinstate dynamic allocators
requirement.
|
|
Fix issue with the Fortran front-end when mapping arrays: when creating the
data MEM_REF for the map clause, there was a convention of casting the
referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).
This causes the alignment passed to the libgomp runtime for array data
hardwared to '1', and causes alignment errors on the offload target.
This patch fixes this by removing the char_type_node pointer converts, and
adding gcc_asserts to ensure POINTER_TYPE_P (TREE_TYPE (ptr)).
PR fortran/90030
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
to char_type_node, add gcc_assert of POINTER_TYPE_P.
(gfc_trans_omp_array_section): Likewise.
(gfc_trans_omp_clauses): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/goacc/finalize-1.f: Adjust scan test.
* gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
* gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-4.f90: Likewise.
* gfortran.dg/gomp/defaultmap-5.f90: Likewise.
* gfortran.dg/gomp/defaultmap-6.f90: Likewise.
* gfortran.dg/gomp/map-3.f90: Likewise.
* gfortran.dg/gomp/pr78260-2.f90: Likewise.
* gfortran.dg/gomp/pr78260-3.f90: Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
* testsuite/libgomp.fortran/pr90030.f90: New test.
(cherry picked from commit 1ac7a8c9e4798d352eb8c64905dd38086af4e1cd)
|
|
This patch fixes a case of setting array low-bounds, found for particular uses
of SOURCE=/MOLD=. This adjusts the relevant part in gfc_trans_allocate() to
set e3_has_nodescriptor only for non-named arrays.
2021-12-03 Tobias Burnus <tobias@codesourcery.com>
gcc/fortran/ChangeLog:
* trans-stmt.c (gfc_trans_allocate): Set e3_has_nodescriptor to true
only for non-named arrays.
gcc/testsuite/ChangeLog:
* gfortran.dg/allocate_with_source_26.f90: Adjust testcase.
* gfortran.dg/allocate_with_mold_4.f90: New testcase.
(cherry picked from commit 6262e3a22b3d86afc116480bc59a7bb30b0cfd40)
|
|
contrib/
* download_prerequisites: Update to gmp-6.2.1, mpfr-4.1.0,
mpc-1.2.1 and isl-0.24.
* prerequisites.md5: Update hash.
* prerequisites.sha512: Likewise.
(cherry picked from commit be60f80247feb72b47af62cda66c82a0fa6c1cdc)
|
|
Adjust tests to changed expectations with the new Graphite-based
"kernels" handling.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90: Adjust.
* testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90: Removed.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/acc-icf.c: Adjust.
* c-c++-common/goacc/cache-3-1.c: Adjust.
* c-c++-common/goacc/classify-kernels-unparallelized-graphite.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Adjust.
* c-c++-common/goacc/classify-serial.c: Adjust.
* c-c++-common/goacc/if-clause-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-2.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Adjust.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Adjust.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: Adjust.
* c-c++-common/goacc/kernels-loop-3.c: Adjust.
* c-c++-common/goacc/loop-2-kernels.c: Adjust.
* c-c++-common/goacc/nested-reductions-2-parallel.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-loops.c: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-straight-line.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-combined-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-conditional-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-auto.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loop-independent_seq.c: Adjust.
* c-c++-common/goacc/note-parallelism-kernels-loops.c: Adjust.
* c-c++-common/goacc/routine-1.c: Adjust.
* c-c++-common/goacc/routine-level-of-parallelism-2.c: Adjust.
* c-c++-common/goacc/routine-nohost-1.c: Adjust.
* c-c++-common/goacc/uninit-copy-clause.c: Adjust.
* gcc.dg/goacc/loop-processing-1.c: Adjust.
* gcc.dg/goacc/nested-function-1.c: Adjust.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust.
* gfortran.dg/goacc/classify-kernels.f95: Adjust.
* gfortran.dg/goacc/classify-parallel.f95: Adjust.
* gfortran.dg/goacc/classify-routine.f95: Adjust.
* gfortran.dg/goacc/classify-serial.f95: Adjust.
* gfortran.dg/goacc/common-block-3.f90: Adjust.
* gfortran.dg/goacc/gang-static.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-1.f95: Adjust.
* gfortran.dg/goacc/kernels-decompose-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Adjust.
* gfortran.dg/goacc/kernels-loop-inner.f95: Adjust.
* gfortran.dg/goacc/kernels-loop.f95: Adjust.
* gfortran.dg/goacc/kernels-tree.f95: Adjust.
* gfortran.dg/goacc/loop-2-kernels.f95: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-2.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-3.f90: Adjust.
* gfortran.dg/goacc/loop-auto-transfer-4.f90: Adjust.
* gfortran.dg/goacc/nested-function-1.f90: Adjust.
* gfortran.dg/goacc/nested-reductions-2-parallel.f90: Adjust.
* gfortran.dg/goacc/pr72741.f90: Adjust.
* gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust.
* gfortran.dg/goacc/private-predetermined-kernels-1.f95: Adjust.
* gfortran.dg/goacc/routine-module-mod-1.f90: Adjust.
* gfortran.dg/goacc/uninit-copy-clause.f95: Adjust.
* c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c: Removed.
|
|
It seems that the check that rejects loops without data references is
only included to avoid handling non-profitable loops. Including those
loops in Graphite's analysis enables more consistent diagnostic
messages in OpenACC "kernels" code and does not introduce any
testsuite regressions. If executing Graphite on loops without
data references leads to noticeable compile time slow-downs for
non-OpenACC users of Graphite, the check can be re-introduced but
restricted to non-OpenACC functions.
gcc/ChangeLog:
* graphite-scop-detection.c (scop_detection::harmful_loop_in_region):
Remove check for loops without data references.
|
|
The find_common_loop function is used in Graphite to obtain a common
super-loop of all loops inside a SCoP. The function is applied to the
loop of the destination block of the edge that leads into the SESE
region and the loop of the source block of the edge that exits the
region. The exit block is usually introduced by the canonicalization
of the loop structure that Graphite does to support its code
generation. If it is empty, it may happen that it belongs to the outer
fake loop. This way, build_alias_set may end up analysing
data-references with respect to this loop although there may exist a
proper super-loop of the SCoP loops. This does not seem to be correct
in general and it leads to problems with runtime alias check creation
which fails if executed on a loop without niter information.
gcc/ChangeLog:
* graphite-scop-detection.c (scop_context_loop): New function.
(build_alias_set): Use scop_context_loop instead of find_common_loop.
* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
* graphite.h (scop_context_loop): New declaration.
|
|
The default values of some parameters that restrict Graphite's
resource usage are too low for many OpenACC codes. Furthermore,
exceeding the limits does not alwas lead to user-visible diagnostic
messages.
This commit increases the parameter values on OpenACC functions. The
values were chosen to allow for the analysis of all "kernels" regions
in the SPEC ACCEL v1.3 benchmark suite. Warnings about exceeded
Graphite-related limits are added to the -fopt-info-missed
output. Those warnings are phrased in a uniform way that intentionally
refers to the "data-dependence analysis" of "OpenACC loops" instead of
"a failure in Graphite" to make them easier to understand for users.
gcc/ChangeLog:
* graphite-optimize-isl.c (optimize_isl): Adjust
param_max_isl_operations value for OpenACC functions and add
special warnings if value gets exceeded.
* graphite-scop-detection.c (build_scops): Likewise for
param_graphite_max_arrays_per_scop.
gcc/testsuite/ChangeLog:
* gcc.dg/goacc/graphite-parameter-1.c: New test.
* gcc.dg/goacc/graphite-parameter-2.c: New test.
|
|
The additional dependences introduced by partial redundancy
elimination proper and by the code hoisting step of the pass very
often cause Graphite to fail on OpenACC functions. On the other hand,
the pass can also enable the analysis of OpenACC loops (cf. e.g. the
loop-auto-transfer-4.f90 testcase), for instance, because full
redundancy elimination removes definitions that would otherwise
prevent the creation of runtime alias checks outside of the SCoP.
This commit disables the actual partial redundancy elimination step as
well as the code hoisting step of pass_pre on OpenACC functions that
might be handled by Graphite.
gcc/ChangeLog:
* tree-ssa-pre.c (insert): Skip any insertions in OpenACC
functions that might be processed by Graphite.
|
|
The loop invariant motion pass correctly refuses to move statements
out of a loop if any other statement in the loop is unanalyzable. The
pass does not know how to handle the OpenACC internal function calls
which was not necessary until recently when the OpenACC device
lowering pass was moved to a later position in the pass pipeline.
This commit changes pass_lim to ignore the OpenACC internal function
calls which do not contain any memory references. The hoisting enabled
by this change can be useful for the data-dependence analysis in
Graphite; for instance, in the outlined functions for OpenACC regions,
all invariant accesses to the ".omp_data_i" struct should be hoisted
out of the OpenACC loop. This is particularly important for variables
that were scalars in the original loop and which have been turned into
accesses to the struct by the outlining process. Not hoisting those
can prevent scalar evolution analysis which is crucial for Graphite.
Since any hoisting that introduces intermediate names - and hence,
"fake" dependences - inside the analyzed nest can be harmful to
data-dependence analysis, a flag to restrict the hoisting in OpenACC
functions is added to the pass. The pass instance that executes before
Graphite now runs with this flag set to true and the pass instance
after Graphite runs unrestricted.
A more precise way of selecting the statements for which hoisting
should be enabled is left for a future improvement.
gcc/ChangeLog:
* passes.def: Set restrict_oacc_hoisting to true for the early
pass_lim instance.
* tree-ssa-loop-im.c (movement_possibility): Add
restrict_oacc_hoisting flag to function; restrict movement if set.
(compute_invariantness): Add restrict_oacc_hoisting flag and pass it on.
(gather_mem_refs_stmt): Skip IFN_GOACC_LOOP and IFN_UNIQUE
calls.
(loop_invariant_motion_in_fun): Add restrict_oacc_hoisting flag and
pass it on.
(pass_lim::execute): Pass on new flags.
* tree-ssa-loop-manip.h (loop_invariant_motion_in_fun): Adjust declaration.
* gimple-loop-interchange.cc (pass_linterchange::execute): Adjust call to
loop_invariant_motion_in_fun.
|
|
This commit concerns loops in OpenACC "kernels" region that have been marked
up with an explicit "independent" clause by the user, but for which Graphite
found data dependences. A discussion on the private internal OpenACC mailing
list suggested that warning the user about the dependences woud be a more
acceptable solution than reverting the user's decision. This behavior is
implemented by the present commit.
gcc/ChangeLog:
* common.opt: Add flag Wopenacc-false-independent.
* omp-offload.c (oacc_loop_warn_if_false_independent): New function.
(oacc_loop_fixed_partitions): Call from here.
|