|
Here at parse time the template argument f (an OVERLOAD) in A<f> gets
resolved ahead of time to the FUNCTION_DECL f<int>, and we defer marking
f<int> as used until instantiation (of g) as usual.
Later when instantiating g the type A<f> (where f has already been
resolved) is non-dependent, so tsubst_aggr_type avoids re-processing its
template arguments, and we end up never actually marking f<int> as used
(which means we never instantiate it) even though A<f>::h() later calls
it, leading to a link error.
This patch works around this issue by looking through ADDR_EXPR when
calling mark_used on the substituted callee of a CALL_EXPR.
PR c++/53164
PR c++/105848
gcc/cp/ChangeLog:
* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Look through an
ADDR_EXPR callee when calling mark_used.
gcc/testsuite/ChangeLog:
* g++.dg/template/fn-ptr3.C: New test.
(cherry picked from commit 733a792a2b2e1662e738fa358b45a2720a8618a7)
|
|
In non-dependent23.C below we expect the Base::foo calls to
resolve to the second, third and fourth overloads respectively in light
of the cv-qualifiers of 'this' in each case. But ever since
r12-6075-g2decd2cabe5a4f, the calls incorrectly resolve to the first
overload at instantiation time.
This happens because the calls to Base::foo are all deemed
non-dependent (ever since r7-755-g23cb72663051cd made us ignore 'this'
dependence when considering the dependence of a non-static memfn call),
hence we end up checking the call ahead of time, using as the object
argument a dummy object of type Base. Since this object argument is
cv-unqualified, the calls in turn resolve to the unqualified overload
of Base::foo. Before r12-6075 this incorrect result would just get
silently discarded and we'd end up redoing OR at instantiation time
using 'this' as the object argument. But after r12-6075 we now reuse
this incorrect result at instantiation time.
This patch fixes this by making maybe_dummy_object respect the cv-quals
of (the non-lambda) 'this' when returning a dummy object. Thus, ahead
of time OR using a dummy object will give us the right answer that's
consistent with the instantiation time answer.
An earlier version of this patch didn't handle 'this'-capturing lambdas
correctly, which broke lambda-this22.C below.
PR c++/105637
gcc/cp/ChangeLog:
* tree.cc (maybe_dummy_object): When returning a dummy
object, respect the cv-quals of 'this' if available.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-this22.C: New test.
* g++.dg/template/non-dependent23.C: New test.
(cherry picked from commit 44a5bd6d933d86ed988fc4695aa00f122cf83eb4)
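A minimal sketch of the kind of code affected, assuming four cv-qualified overloads as the message describes. It is not the actual non-dependent23.C; `Derived`, the member function names and `demo` are hypothetical, and the return values exist only to identify which overload was chosen.

```cpp
#include <cassert>

struct Base {
  int foo()                { return 0; }  // #1: cv-unqualified
  int foo() const          { return 1; }  // #2
  int foo() volatile       { return 2; }  // #3
  int foo() const volatile { return 3; }  // #4
};

template<class T>
struct Derived : Base {
  // Base is not a dependent base, so each Base::foo() call is non-dependent
  // and can be checked before instantiation. The cv-qualifiers of 'this'
  // must still select the overload.
  int from_const() const                   { return Base::foo(); }
  int from_volatile() volatile             { return Base::foo(); }
  int from_const_volatile() const volatile { return Base::foo(); }
};

void demo() {
  Derived<int> d;
  assert(d.from_const() == 1);           // #2
  assert(d.from_volatile() == 2);        // #3
  assert(d.from_const_volatile() == 3);  // #4
}
```

If the calls were resolved against a cv-unqualified dummy object and that result were reused, all three would select #1 instead.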
|
|
This patch makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINTS
of each template parameter except as necessary for declaration matching,
like we already do for the other constituent constraints of a declaration.
This patch also improves the CA104 implementation of explicit
specialization matching of a constrained function template inside a
class template, by considering the function's combined constraints
instead of just its trailing constraints. This allows us to correctly
handle the first three explicit specializations in concepts-spec2.C
below, but because we compare the constraints as a whole, it means we
incorrectly accept the fourth explicit specialization which writes #3's
constraints in a different way. For complete correctness here,
determine_specialization should use tsubst_each_template_parm_constraints
and template_parameter_heads_equivalent_p.
PR c++/100374
gcc/cp/ChangeLog:
* pt.cc (determine_specialization): Compare overall constraints
not just the trailing constraints.
(tsubst_each_template_parm_constraints): Define.
(tsubst_friend_function): Use it.
(tsubst_friend_class): Use it.
(tsubst_template_parm): Don't substitute TEMPLATE_PARM_CONSTRAINTS.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-spec2.C: New test.
* g++.dg/cpp2a/concepts-template-parm11.C: New test.
(cherry picked from commit 43c013df02fdb07f9b7a5e7e6669e6d69769d451)
|
|
Here the out-of-line definition of Z<T>::z causes duplicate_decls to
change z's type from using the primary template type Z<T> (which is also
the type of the injected class name) to the implicit instantiation Z<T>,
and this latter type lacks a TYPE_BINFO (although its TYPE_CANONICAL was
set by a special case in lookup_template_class to point to the former).
Later, when processing the non-dependent call z->foo(0), build_over_call
relies on the object argument's TYPE_BINFO to build the templated form
for this call, which fails because the object argument type has empty
TYPE_BINFO due to the above.
It seems weird that the implicit instantiation Z<T> doesn't have the
same TYPE_BINFO as the primary template type Z<T>, despite them being
proclaimed equivalent via TYPE_CANONICAL. So I tried also setting
TYPE_BINFO in the special case in lookup_template_class, but that led to
some problems with constrained partial specializations of the form Z<T>.
I'm not sure what, if anything, we ought to do about the subtle
differences between these two versions of the same type.
Fortunately it seems we don't need to rely on TYPE_BINFO at all in
build_over_call here -- the z_candidate struct already contains the
exact binfos we need to rebuild the BASELINK for the templated form.
PR c++/105758
gcc/cp/ChangeLog:
* call.cc (build_over_call): Use z_candidate::conversion_path
and ::access_path instead of TYPE_BINFO when building the
BASELINK for the templated form.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent24.C: New test.
(cherry picked from commit 4f84f12066953186cce4328b7f178d3daa2fe96e)
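For orientation, code of the following shape exercises the path described above: a static member declared inside the class, redeclared by an out-of-line definition, and then used in a call that does not depend on T. This is a hedged sketch rather than the real non-dependent24.C; the body of `foo`, the member `call` and the helper `demo` are assumptions made for illustration.

```cpp
#include <cassert>

template<class T>
struct Z {
  static Z* z;                      // declared via the injected-class-name
  int foo(int v) { return v + 1; }  // hypothetical body, for observability
  int call();
};

// Out-of-line definition: the declared type is now spelled Z<T>*.
template<class T> Z<T>* Z<T>::z = nullptr;

// The call z->foo(0) is checked when the template is defined.
template<class T>
int Z<T>::call() { return z->foo(0); }

void demo() {
  Z<int> obj;
  Z<int>::z = &obj;
  assert(obj.call() == 1);
}
```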
|
|
Here during cp_parser_single_declaration for #2, we were calling
associate_classtype_constraints for TPL<T> (the primary template type)
before maybe_process_partial_specialization could get a chance to
notice that we're in fact declaring a distinct constrained partial
spec and not redeclaring the primary template. This caused us to
emit a bogus error about differing constraints between the primary
template and #2. This patch fixes this by moving the call to
associate_classtype_constraints after the call to shadow_tag (which
calls maybe_process_partial_specialization) and adjusting shadow_tag to
use the return value of maybe_process_partial_specialization.
Moreover, if we later try to define a constrained partial specialization
that's been declared earlier (as in the third testcase), then
maybe_new_partial_specialization correctly notices it's a redeclaration
and returns NULL_TREE. But in this case we also need to update TYPE to
point to the redeclared partial spec (it'll otherwise continue pointing
to the primary template type, eventually leading to a bogus error).
PR c++/96363
gcc/cp/ChangeLog:
* decl.cc (shadow_tag): Use the return value of
maybe_process_partial_specialization.
* parser.cc (cp_parser_single_declaration): Call shadow_tag
before associate_classtype_constraints.
* pt.cc (maybe_new_partial_specialization): Change return type
to bool. Take 'type' argument by mutable reference. Set 'type'
to point to the correct constrained specialization when
appropriate.
(maybe_process_partial_specialization): Adjust accordingly.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-partial-spec12.C: New test.
* g++.dg/cpp2a/concepts-partial-spec12a.C: New test.
* g++.dg/cpp2a/concepts-partial-spec13.C: New test.
(cherry picked from commit 97dc78d705a90c1ae83c78a7f2e24942cc3a6257)
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/101330
* openmp.cc (gfc_match_iterator): Remove left-over code from
development that could lead to a crash on invalid input.
gcc/testsuite/ChangeLog:
PR fortran/101330
* gfortran.dg/gomp/affinity-clause-7.f90: New test.
(cherry picked from commit 26bbe78f77f73bb66af1ac13d0deec888a3c6510)
|
|
|
|
Here we crash because we attempt to compute a remainder (%) with a zero
divisor. This patch avoids doing so.
PR c++/105634
gcc/cp/ChangeLog:
* call.cc (maybe_warn_class_memaccess): Avoid % by zero.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wclass-memaccess-7.C: New test.
|
|
get_memory_rtx tries hard to come up with a MEM_EXPR to record
in the memory attributes, but in the last fallback it fails to properly
account for an unknown offset and thus, as visible in this testcase,
set_mem_attributes computes an incorrect alignment. The following
rectifies both parts.
PR middle-end/106331
* builtins.cc (get_memory_rtx): Compute alignment from
the original address and set MEM_OFFSET to unknown when
we create a MEM_EXPR from the base object of the address.
* gfortran.dg/pr106331.f90: New testcase.
(cherry picked from commit e4ff11a8f2e80adb8ada69bf35ee6a1ab18a9c85)
|
|
The following makes sure to not use the original TBAA type for
looking up a value across an aggregate copy when we had to offset
the read.
2022-06-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/106131
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Force alias-set
zero when offsetting the read looking through an aggregate
copy.
* g++.dg/torture/pr106131.C: New testcase.
(cherry picked from commit 9701432ff79926a5dd3303be3417e0bd0c24140b)
|
|
The following fixes a mistake in looking up an extended operand
in the CSE of a truncated operation.
2022-06-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/106112
* tree-ssa-sccvn.cc (valueized_wider_op): Properly extend
a constant operand according to its type.
* gcc.dg/torture/pr106112.c: New testcase.
(cherry picked from commit 2dbb45d6dc0d20dc159b3d8e27ebb6825074827a)
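Why the extension has to follow the operand's type can be seen in a small standalone sketch. This is not the valueized_wider_op code; `extend_constant`, its parameters and `demo` are hypothetical names, and the function simply widens the low bits of a constant to 64 bits either by sign or by zero extension.

```cpp
#include <cassert>
#include <cstdint>

// Widen the low `precision` bits of `bits` to 64 bits, choosing sign or
// zero extension according to the signedness of the constant's type.
std::int64_t extend_constant(std::uint64_t bits, unsigned precision,
                             bool is_signed) {
  if (precision == 0) return 0;
  if (precision >= 64) return static_cast<std::int64_t>(bits);
  const std::uint64_t mask = (std::uint64_t{1} << precision) - 1;
  std::uint64_t low = bits & mask;
  if (is_signed && ((low >> (precision - 1)) & 1) != 0)
    low |= ~mask;  // replicate the sign bit into the upper bits
  return static_cast<std::int64_t>(low);
}

void demo() {
  // The same 8-bit pattern widens to different values depending on the type.
  assert(extend_constant(0xFF, 8, true) == -1);
  assert(extend_constant(0xFF, 8, false) == 255);
}
```

Picking the wrong kind of extension for a constant makes the widened operation compare unequal to (or wrongly equal to) an existing wider expression.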
|
|
The fold_to_nonsharp_ineq_using_bound folding ends up creating invalid
typed IL which confuses later foldings. The following fixes that.
2022-06-20 Richard Biener <rguenther@suse.de>
PR middle-end/106027
* fold-const.cc (fold_to_nonsharp_ineq_using_bound): Use the
type of the prevailing comparison for the new comparison type.
(fold_binary_loc): Use proper types for the A < X && A + 1 > Y
to A < X && A >= Y folding.
* gcc.dg/pr106027.c: New testcase.
(cherry picked from commit 713f2fd923442b1be620a44240ddf786ae0ab476)
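The folding named in the ChangeLog rewrites A < X && A + 1 > Y as A < X && A >= Y. A brute-force check over a small integer range, where A + 1 cannot overflow, shows the two forms agree. This sketch only illustrates the arithmetic identity; the function names are hypothetical and it says nothing about the type handling that the patch actually fixes.

```cpp
#include <cassert>

// Original form of the condition.
constexpr bool before(int a, int x, int y) { return a < x && a + 1 > y; }

// Folded form.
constexpr bool after(int a, int x, int y) { return a < x && a >= y; }

// Compare both forms for every (a, x, y) in [lo, hi]^3.
// Assumes hi is far enough below INT_MAX that a + 1 does not overflow.
bool forms_agree(int lo, int hi) {
  for (int a = lo; a <= hi; ++a)
    for (int x = lo; x <= hi; ++x)
      for (int y = lo; y <= hi; ++y)
        if (before(a, x, y) != after(a, x, y)) return false;
  return true;
}

void demo() { assert(forms_agree(-8, 8)); }
```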
|
|
When DSE asks whether __real a is using __imag a it gets a surprising
result when a is a FUNCTION_DECL. The following makes this case
less surprising to callers while keeping the bail-out for the
non-decl case, where it is true that PTA doesn't track aliases to code
correctly.
2022-06-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/105971
* tree-ssa-alias.cc (refs_may_alias_p_2): Put bail-out for
FUNCTION_DECL and LABEL_DECL refs after decl-decl disambiguation
to leak less surprising alias results.
* gcc.dg/torture/pr106971.c: New testcase.
(cherry picked from commit 8c2733e16ec1c0cdda3db4cdc5ad158a96a658e8)
|
|
For a [0][0] array we have to be careful when dividing by the element
size which is zero for the outermost dimension. Luckily the division
is only for an overflow check which is pointless for array size zero.
2022-06-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/105969
* gimple-ssa-sprintf.cc (get_origin_and_offset_r): Avoid division
by zero in overflow check.
* gcc.dg/pr105969.c: New testcase.
(cherry picked from commit edb9330c29fe8a0a0b76df6fafd6a223a4d0e41f)
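The guard described above can be sketched as follows, assuming an overflow check of the usual "index > max / element size" form. The function `offset_overflows` and its parameters are hypothetical and do not mirror get_origin_and_offset_r.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if index * elt_size would exceed max.
// With a zero element size (e.g. the outer dimension of a [0][0] array)
// the product is always zero, so the check is skipped instead of dividing.
bool offset_overflows(std::uint64_t index, std::uint64_t elt_size,
                      std::uint64_t max) {
  if (elt_size == 0) return false;  // nothing can overflow; avoid max / 0
  return index > max / elt_size;
}

void demo() {
  assert(!offset_overflows(5, 0, 100));  // zero-sized elements
  assert(offset_overflows(3, 4, 10));    // 12 > 10
  assert(!offset_overflows(2, 4, 10));   // 8 <= 10
}
```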
|
|
When we got the simplification of bit-field-ref to view-convert
we lost the ability to detect FMAs since we cannot look through
_1 = {_10};
_11 = VIEW_CONVERT_EXPR<float>(_1);
The following amends the (view_convert CONSTRUCTOR) pattern
to handle this case.
2022-06-14 Richard Biener <rguenther@suse.de>
PR middle-end/105965
* match.pd (view_convert CONSTRUCTOR): Handle single-element
CTOR case.
* gcc.target/i386/pr105965.c: New testcase.
(cherry picked from commit 90467f0ad649d0817f9e034596a0fb85605b55af)
|
|
The uninit diagnostics make use of pass-by-reference and access attributes,
but doing so iterates over the function type's arguments, which can in some
cases apparently outrun the actual arguments, leading to ICEs.
The following simply ignores arguments that are not present.
2022-06-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/105946
* tree-ssa-uninit.cc (maybe_warn_pass_by_reference):
Do not look at arguments not specified in the function call.
(cherry picked from commit e07a876c07601e1f3a27420f7d055d20193c362c)
|
|
The following avoids the need to massage the target optimization
node at WPA time when we fixup the optimization node, copying
FP related flags from callee to caller. The target is already
set up to fixup, but that only works when not switching between
functions. After fixing that the fixup is then done at LTRANS
time when materializing the function.
2022-07-01 Richard Biener <rguenther@suse.de>
PR target/105459
* config/i386/i386-options.cc (ix86_set_current_function):
Rebuild the target optimization node whenever necessary,
not only when the optimization node didn't change.
* gcc.dg/lto/pr105459_0.c: New testcase.
(cherry picked from commit 4c94382a132a4b2b9d020806549a006fa6764d1b)
|
|
|
|
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/104313
* trans-decl.cc (gfc_generate_return): Do not generate conflicting
fake results for functions with no result variable under -ff2c.
gcc/testsuite/ChangeLog:
PR fortran/104313
* gfortran.dg/pr104313.f: New test.
(cherry picked from commit 517fb1a78102df43f052c6934c27dd51d786aff7)
|
|
|
|
|
|
Testing has found that using load and store vector pair instructions for
block copies can result in a slowdown on power10. This patch disables using the
vector pair instructions for block copies if we are tuning for power10.
2022-06-11 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Do
not generate block copies with vector pair instructions if we are
tuning for power10. Back port from master branch.
|
|
In check_new_reg_p, the nregs of a du chain is computed by obtaining the
MODE of the first element in the chain, and then calling
hard_regno_nregs() with the MODE. But the first element of the chain can
be a DEBUG_INSN whose mode need not be the same as the rest of the
elements in the du chain. This was resulting in fcompare-debug failure
as check_new_reg_p was returning a different result with -g for the same
candidate register. We can instead obtain nregs from the du chain
itself.
2022-06-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc/
PR rtl-optimization/105041
* regrename.cc (check_new_reg_p): Use nregs value from du chain.
gcc/testsuite/
PR rtl-optimization/105041
* gcc.target/powerpc/pr105041.c: New test.
(cherry picked from commit 3e16b4359e86b36676ed01219e6deafa95f3c16b)
|
|
|
|
Merge up to r12-8566-g8c57e8005db4864ecfba791d788f0b1dc6110f3b (13th July 2022)
|
|
|
|
Ensure that the "max_vf" figure used for the "safelen" attribute is large
enough for the largest configured offload device.
This change gives a ~10x speed improvement on the BabelStream "dot" benchmark for
AMD GCN.
gcc/ChangeLog:
* gimple-loop-versioning.cc (loop_versioning::loop_versioning): Add
comment.
* omp-general.cc (omp_max_simd_vf): New function.
* omp-general.h (omp_max_simd_vf): New prototype.
* omp-low.cc (lower_rec_simd_input_clauses): Select largest from
omp_max_vf, omp_max_simt_vf, and omp_max_simd_vf.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_amdgcn_offloading_enabled): New.
(check_effective_target_nvptx_offloading_enabled): New.
* gcc.dg/gomp/target-vf.c: New test.
|
|
|
|
As the testcase in PR 105860 shows, the code that tries to re-use the
handled_component chains in SRA can be horribly confused by unions,
where it thinks it has found a compatible structure under which it can
chain the references, but in fact it found the type it was looking
for elsewhere in a union and generated a write to a completely wrong
part of an aggregate.
I don't remember whether the plan was to support unions at all in
build_reconstructed_reference but it can work, to an extent, if we
make sure that we start the search only outside the outermost union,
which is what the patch does (and the extra testcase verifies).
Additionally, this commit also has squashed into it a backport of
b984b84cbe4bf026edef2ba37685f3958a1dc1cf which fixes the testcase
gcc.dg/tree-ssa/alias-access-path-13.c for many 32-bit targets.
gcc/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* tree-sra.cc (build_reconstructed_reference): Start expr
traversal only just below the outermost union.
gcc/testsuite/ChangeLog:
2022-07-01 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/105860
* gcc.dg/tree-ssa/alias-access-path-13.c: New test.
* gcc.dg/tree-ssa/pr105860.c: Likewise.
(cherry picked from commit b110e5283e368b5377e04766e4ff82cd52634208)
|
|
|
|
(mult (sign_extend:DI rj:SI) (sign_extend:DI rk:SI)) should be
"mulw.d.w", not "mul.d".
gcc/ChangeLog:
* config/loongarch/loongarch.md (mulsidi3_64bit): Use mulw.d.w
instead of mul.d.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/mulw_d_w.c: New test.
* gcc.c-torture/execute/mul-sext.c: New test.
(cherry picked from commit 1fa42d62140b56589771eb3d46f89c810bfc8e0a)
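The RTL pattern above corresponds to a widening signed multiply at the source level: both 32-bit operands are sign-extended to 64 bits and then multiplied. A minimal sketch of that operation follows. The name `mul_sext` is borrowed from the test file name and the function body is an assumption, not the contents of mul-sext.c.

```cpp
#include <cassert>
#include <cstdint>

// 32x32 -> 64 signed multiply: sign-extend both operands, then multiply.
std::int64_t mul_sext(std::int32_t a, std::int32_t b) {
  return static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}

void demo() {
  assert(mul_sext(-3, 4) == -12);
  // The result needs more than 32 bits here.
  assert(mul_sext(65536, 65536) == 4294967296LL);
  assert(mul_sext(INT32_MIN, -1) == 2147483648LL);
}
```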
|
|
|
|
This is a backport of the fix for PR target/105930 from mainline to the
gcc12 release branch.
2022-07-09 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
PR target/105930
* config/i386/i386.md (*<any_or>di3_doubleword): Split after
reload. Use rtx_equal_p to avoid creating memory-to-memory moves,
and emit NOTE_INSN_DELETED if operand[2] is zero (i.e. with -O0).
|
|
|
|
Fix-up for recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
gcc/
* lto-cgraph.cc (input_offload_tables) <LTO_symtab_edge>: Correct
'fn2' computation.
libgomp/
* testsuite/libgomp.c-c++-common/requires-1.c: Add 'dg-note's.
* testsuite/libgomp.c-c++-common/requires-2.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-3.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-7.c: Likewise.
* testsuite/libgomp.fortran/requires-1.f90: Likewise.
(cherry picked from commit faa0c328ee65f0d6d65d6e20181d26e336071919)
|
|
frame->mask or frame->fmask is zero.
Under the LA architecture, when the stack is dropped too far, the process
of dropping the stack is divided into two steps.
step1: After dropping the stack, save callee saved registers on the stack.
step2: The rest of it.
The stack drop operation is optimized when frame->total_size minus
frame->sp_fp_offset is an integer multiple of 4096, which reduces the number
of instructions required to drop the stack. However, this optimization is
not effective because of the original calculation method.
Consider the following case:
#include <stdio.h>

int main()
{
  char buf[1024 * 12];
  printf ("%p\n", buf);
  return 0;
}
As you can see from the generated assembler, the old GCC emits two more
instructions than the new GCC: lines 14 and 24 of the old output.
   new                         |    old
10 main:                       | 11 main:
11 addi.d   $r3,$r3,-16        | 12 lu12i.w  $r13,-12288>>12
12 lu12i.w  $r13,-12288>>12    | 13 addi.d   $r3,$r3,-2032
13 lu12i.w  $r5,-12288>>12     | 14 ori      $r13,$r13,2016
14 lu12i.w  $r12,12288>>12     | 15 lu12i.w  $r5,-12288>>12
15 st.d     $r1,$r3,8          | 16 lu12i.w  $r12,12288>>12
16 add.d    $r12,$r12,$r5      | 17 st.d     $r1,$r3,2024
17 add.d    $r3,$r3,$r13       | 18 add.d    $r12,$r12,$r5
18 add.d    $r5,$r12,$r3       | 19 add.d    $r3,$r3,$r13
19 la.local $r4,.LC0           | 20 add.d    $r5,$r12,$r3
20 bl       %plt(printf)       | 21 la.local $r4,.LC0
21 lu12i.w  $r13,12288>>12     | 22 bl       %plt(printf)
22 add.d    $r3,$r3,$r13       | 23 lu12i.w  $r13,8192>>12
23 ld.d     $r1,$r3,8          | 24 ori      $r13,$r13,2080
24 or       $r4,$r0,$r0        | 25 add.d    $r3,$r3,$r13
25 addi.d   $r3,$r3,16         | 26 ld.d     $r1,$r3,2024
26 jr       $r1                | 27 or       $r4,$r0,$r0
                               | 28 addi.d   $r3,$r3,2032
                               | 29 jr       $r1
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_compute_frame_info):
Modify fp_sp_offset and gp_sp_offset's calculation method,
when frame->mask or frame->fmask is zero, don't minus UNITS_PER_WORD
or UNITS_PER_FP_REG.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/prolog-opt.c: New test.
(cherry picked from commit aa8fd7f65683ef9c3b6d2e9306bea2f28b5cadf7)
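The arithmetic behind the listing can be checked with a short sketch. The numbers are read off the assembly above (a total adjustment of 12304 bytes, with a first step of 16 in the new output and 2032 in the old one). The helper names are hypothetical, and this does not reproduce the logic of loongarch_compute_frame_info.

```cpp
#include <cassert>
#include <cstdint>

// Second-step adjustment, given the total frame size and the first step.
constexpr std::int64_t second_step(std::int64_t total, std::int64_t step1) {
  return total - step1;
}

// A value whose low 12 bits are zero can be loaded with a single lu12i.w;
// otherwise an extra ori is needed to fill in the low bits.
constexpr bool single_lu12i(std::int64_t value) { return value % 4096 == 0; }

void demo() {
  constexpr std::int64_t total = 12304;  // 16 + 12288 == 2032 + 10272

  // New code: first step 16, second step 12288 (a multiple of 4096).
  assert(second_step(total, 16) == 12288);
  assert(single_lu12i(second_step(total, 16)));

  // Old code: first step 2032, second step 10272 == 8192 + 2080.
  assert(second_step(total, 2032) == 10272);
  assert(!single_lu12i(second_step(total, 2032)));
}
```

The two extra `ori` instructions in the old output (one in the prologue, one in the epilogue) are exactly the cost of the second step not being a multiple of 4096.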
|
|
|
|
Similar to how the other 'mkoffload's got changed in
recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
This also means finally switching Intel MIC 'mkoffload' to
'GOMP_offload_register_ver', 'GOMP_offload_unregister_ver',
making 'GOMP_offload_register', 'GOMP_offload_unregister'
legacy entry points.
gcc/
* config/i386/intelmic-mkoffload.cc (generate_host_descr_file)
(prepare_target_image, main): Handle OpenMP 'requires'.
(generate_host_descr_file): Switch to 'GOMP_offload_register_ver',
'GOMP_offload_unregister_ver'.
libgomp/
* target.c (GOMP_offload_register, GOMP_offload_unregister):
Denote as legacy entry points.
* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_any): New proc.
* testsuite/libgomp.c-c++-common/requires-1.c: Enable for
'offload_target_any'.
* testsuite/libgomp.c-c++-common/requires-3.c: Likewise.
* testsuite/libgomp.c-c++-common/requires-7.c: Likewise.
* testsuite/libgomp.fortran/requires-1.f90: Likewise.
(cherry picked from commit 9ef714539cb7cc1cd746312fd5dcc987bf167471)
|
|
The recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp" changed the
'GOMP_offload_register_ver' interface but didn't change
'GOMP_offload_unregister_ver' accordingly, so we're no longer
actually unregistering.
gcc/
* config/gcn/mkoffload.cc (process_obj): Clarify 'target_data' ->
'[...]_data'.
* config/nvptx/mkoffload.cc (process): Likewise.
libgomp/
* target.c (GOMP_offload_register_ver): Clarify 'target_data' ->
'data'.
(GOMP_offload_unregister_ver): Likewise. Fix up 'target_data'.
(cherry picked from commit 3f05e03d6cfdf723ca0556318b6a9aa37be438e7)
|
|
Clean up for recent commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0
"OpenMP: Move omp requires checks to libgomp".
gcc/
* omp-general.h (enum omp_requires): Use 'GOMP_REQUIRES_[...]'.
include/
* gomp-constants.h (OMP_REQUIRES_[...]): Update comment.
(cherry picked from commit 2f0d819a81edee50a98a8a05eed585f0a72bb932)
|
|
|
|
gcc/c-family/ChangeLog:
* known-headers.cc (get_stdlib_header_for_name): Add <time.h>
names.
gcc/testsuite/ChangeLog:
* g++.dg/spellcheck-stdlib.C: Check <ctime> types and functions.
(cherry picked from commit d489ec082ea214109ff54071410f8cd00344e654)
|
|
The <https://gcc.gnu.org/pipermail/gcc/2022-May/238679.html> thread
seems to have concluded that -Wformat shouldn't warn about
printf((const char*) u8"test %d\n", 1);
saying "format string is not an array of type 'char'". This code
is not an aliasing violation, and there are no I/O functions for u8
strings, so the const char * cast is OK and shouldn't be disregarded.
PR c++/105626
gcc/c-family/ChangeLog:
* c-format.cc (check_format_arg): Don't emit -Wformat warnings with
u8 strings.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wformat-char8_t-1.C: New test.
(cherry picked from commit 543828e79bfa63ef26b11a2c9ea81fd7905f33aa)
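A small sketch of the pattern under discussion, using snprintf instead of printf so that the result can be inspected. The helper `format_u8` and the buffer size are assumptions for illustration. The cast is a no-op before C++20, where u8 literals have type const char[], and a real pointer conversion from const char8_t* in C++20.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format through a u8 string literal cast to const char*, as in the
// example from the mailing-list thread.
std::string format_u8() {
  char buf[32];
  std::snprintf(buf, sizeof buf, (const char*) u8"test %d\n", 1);
  return buf;
}

void demo() { assert(format_u8() == "test 1\n"); }
```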
|
|
|
|
This commit reverts OG12 commits
commit 6ab6303f61c812d6e2f05d44af5cc79815c57bb8
'OpenMP 5.0: requires directive'
and
commit 47e5ba646c20fdb64fa233dd570f155880cafd04
'[WIP] OpenMP 5.0: requires directive: workaround to fix libgomp IntelMIC plugin build'
And replaces those by the upstream version, commit r13-1458, i.e.
(cherry picked from commit 683f11843974f0bdf42f79cdcbb0c2b43c7b81b0)
It also updates
* libgomp/plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices)
to permit GOMP_REQUIRES_UNIFIED_ADDRESS | GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
moved from the now no-longer existing function GOMP_OFFLOAD_supported_features;
the flags were added in OG12 commit 12d14a9a255c1cc10e4506935327aabd9766967d,
'amdgcn: libgomp plugin USM implementation'.
And, likewise, it updates
* libgomp/plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices)
from OG12 commit f3fd38e31fafdc7f6b6a721b53bdc595c4ec1e09,
'libgomp, nvptx: report USM'.
* * *
Handle reverse_offload, unified_address, and unified_shared_memory
requirements in libgomp by saving them alongside the offload table.
When the device lto1 runs, it extracts the data for mkoffload. The
latter then passes the value on to GOMP_offload_register_ver.
lto1 (either the host one, with -flto [+ ENABLE_OFFLOADING], or the
offload-device lto1) also does the consistency check,
erroring out when the 'omp requires' clause use is inconsistent.
For all in-principle supported devices, if a requirement cannot be fulfilled,
the device is excluded from the (supported) devices list. Currently, none of
those requirements are marked as supported for any of the non-host devices.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_target_data, c_parser_omp_target_update,
c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Set
OMP_REQUIRES_TARGET_USED.
(c_parser_omp_requires): Remove sorry.
gcc/ChangeLog:
* config/gcn/mkoffload.cc (process_asm): Write '#include <stdint.h>'.
(process_obj): Pass omp_requires_mask to GOMP_offload_register_ver.
(main): Ask lto1 to obtain omp_requires_mask and pass it on.
* config/nvptx/mkoffload.cc (process, main): Likewise.
* lto-cgraph.cc (omp_requires_to_name): New.
(input_offload_tables): Save omp_requires_mask.
(output_offload_tables): Read it, check for consistency,
save value for mkoffload.
* omp-low.cc (lower_omp_target): Force output_offloadtables
call for OMP_REQUIRES_TARGET_USED.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_target_data,
cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data,
cp_parser_omp_target_update): Set OMP_REQUIRES_TARGET_USED.
(cp_parser_omp_requires): Remove sorry.
gcc/fortran/ChangeLog:
* openmp.cc (gfc_match_omp_requires): Remove sorry.
* parse.cc (decode_omp_directive): Don't regard 'declare target'
as target usage for 'omp requires'; add more flags to
omp_requires_mask.
include/ChangeLog:
* gomp-constants.h (GOMP_VERSION): Bump to 2.
(GOMP_REQUIRES_UNIFIED_ADDRESS, GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
GOMP_REQUIRES_REVERSE_OFFLOAD, GOMP_REQUIRES_TARGET_USED):
New defines.
libgomp/ChangeLog:
* libgomp-plugin.h (GOMP_OFFLOAD_get_num_devices): Add
omp_requires_mask arg.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Likewise;
return -1 when device available but omp_requires_mask != 0.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise.
* oacc-host.c (host_get_num_devices, host_openacc_get_property):
Update call.
* oacc-init.c (resolve_device, acc_init_1, acc_shutdown_1,
goacc_attach_host_thread_to_device, acc_get_num_devices,
acc_set_device_num, get_property_any): Likewise.
* target.c (omp_requires_mask): New global var.
(gomp_requires_to_name): New.
(GOMP_offload_register_ver): Handle passed omp_requires_mask.
(gomp_target_init): Handle omp_requires_mask.
* libgomp.texi (OpenMP 5.0): Update requires impl. status.
(OpenMP 5.1): Add a missed item.
(OpenMP 5.2): Mark linear-clause change as supported in C/C++.
* testsuite/libgomp.c-c++-common/requires-1-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-1.c: New test.
* testsuite/libgomp.c-c++-common/requires-2-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-2.c: New test.
* testsuite/libgomp.c-c++-common/requires-3-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-3.c: New test.
* testsuite/libgomp.c-c++-common/requires-4-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-4.c: New test.
* testsuite/libgomp.c-c++-common/requires-5-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-5.c: New test.
* testsuite/libgomp.c-c++-common/requires-6.c: New test.
* testsuite/libgomp.c-c++-common/requires-7-aux.c: New test.
* testsuite/libgomp.c-c++-common/requires-7.c: New test.
* testsuite/libgomp.fortran/requires-1-aux.f90: New test.
* testsuite/libgomp.fortran/requires-1.f90: New test.
liboffloadmic/ChangeLog:
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_num_devices):
Return -1 when device available but omp_requires_mask != 0.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/requires-4.c: Update dg-*.
* c-c++-common/gomp/reverse-offload-1.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-2.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-3.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
* c-c++-common/gomp/target-device-ancestor-5.c: Likewise.
* gfortran.dg/gomp/target-device-ancestor-3.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-2.f90: Likewise. Move
post-FE checks to ...
* gfortran.dg/gomp/target-device-ancestor-2a.f90: ... this new file.
* gfortran.dg/gomp/requires-8.f90: Update as we don't regard
'declare target' for the 'requires' usage requirement.
Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
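The device-exclusion rule described above (return -1 when a device is available but a requirement cannot be fulfilled) can be sketched as a bitmask check. Everything here is hypothetical: the flag names and values are placeholders rather than the real GOMP_REQUIRES_* constants from gomp-constants.h, and `usable_devices` is not the signature of GOMP_OFFLOAD_get_num_devices.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder flags for illustration only.
enum : std::uint32_t {
  kUnifiedAddress      = 1u << 0,
  kUnifiedSharedMemory = 1u << 1,
  kReverseOffload      = 1u << 2,
};

// Returns the number of usable devices, or -1 if devices are present but
// at least one required feature is outside the supported set.
int usable_devices(int present, std::uint32_t required,
                   std::uint32_t supported) {
  if (present > 0 && (required & ~supported) != 0) return -1;
  return present;
}

void demo() {
  const std::uint32_t usm = kUnifiedAddress | kUnifiedSharedMemory;
  assert(usable_devices(2, 0, 0) == 2);                 // nothing required
  assert(usable_devices(2, kReverseOffload, 0) == -1);  // unsupported
  assert(usable_devices(2, usm, usm) == 2);             // all supported
  assert(usable_devices(0, kReverseOffload, 0) == 0);   // no devices at all
}
```

A plugin that supports nothing corresponds to `supported == 0`, which matches the "omp_requires_mask != 0" condition in the ChangeLog entries.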
|
|
Merge up to r12-8550-g92d1e271a92dff333314f18379e61d8b3d84d2bd (5th July 2022)
|
|
Fortran part to C/C++
commit r13-1002-g03b71406323ddc065b1d7837d8b43b17e4b048b5
gcc/fortran/ChangeLog:
* gfortran.h (gfc_omp_namelist): Update by creating 'linear' struct,
move 'linear_op' into it as 'op' and add 'old_modifier' to it.
* dump-parse-tree.cc (show_omp_namelist): Update accordingly.
* module.cc (mio_omp_declare_simd): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
* openmp.cc (resolve_omp_clauses): Likewise; accept new-style
'val' modifier with do/simd.
(gfc_match_omp_clauses): Handle OpenMP 5.2 linear clause syntax.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2): Mark linear-clause change as 'Y'.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/linear-4.c: New test.
* gfortran.dg/gomp/linear-2.f90: New test.
* gfortran.dg/gomp/linear-3.f90: New test.
* gfortran.dg/gomp/linear-4.f90: New test.
* gfortran.dg/gomp/linear-5.f90: New test.
* gfortran.dg/gomp/linear-6.f90: New test.
* gfortran.dg/gomp/linear-7.f90: New test.
* gfortran.dg/gomp/linear-8.f90: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit c3297044f0055880dd23ffbf641aa3a5860197e1)
|