Age | Commit message (Collapse) | Author | Files | Lines |
|
This merges mainline's
r14-1301-gd64e8e1224708e
Fortran/OpenMP: Add parsing support for allocators/allocate directives
into OG13.
In theory, that just replaces (reverts) the OG13-only commit
6ce5ee77f73 Add parsing support for allocate directive (OpenMP 5.0)
by Hafiz Abid Qadeer <abidh@codesourcery.com>
However, the replacement is not full - thus, this commit does much more:
While mainline support 'omp allocators' besides 'omp allocate' and
the latter also for stack and pointer variables,
the OG13 branch supports actual allocation with the directive version.
* Thus, this commit wires the mainline's executive allocate + new allocators
with the OG13 GOMP_alloc/GOMP_free handling.
* It also handles now the 'align' clause/modifier.
* By defaulting to 'omp allocators', it re-uses the ALLOCATE clause in the
ME instead of using the new ALLOCATOR clause, which has now been removed
from the ME code.
On the FE side, it also includes a modified version of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634096.html
[Patch] OpenMP/Fortran: Handle unlisted items in 'omp allocators' + exec. 'omp allocate'
The whole patch cannot be easily disentangled and parts of
6ce5ee77f73 remain in a revised form. For those bits, the
credits belong to Abid.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_namelist, show_omp_node,
show_code_node): Dump EXEC_OMP_ALLOCATORS update for struct changes.
* gfortran.h (enum gfc_statement): Add ST_OMP_(END_)ALLOCATORS.
(gfc_omp_namelist): Add allocator to union u2.
(enum): Remove OMP_LIST_ALLOCATOR.
(gfc_namespace): Add omp_allocate.
(enum gfc_exec_op): Add EXEC_OMP_ALLOCATORS.
(gfc_resolve_omp_allocate): New prototype.
* match.cc (gfc_free_omp_namelist): Free u2.allocator.
* match.h (gfc_match_omp_allocators): New prototype.
* openmp.cc (gfc_omp_directives): Enable allocate/allocator.
(gfc_match_omp_variable_list): Add reject_common_vars Boolean arg.
(enum omp_mask2): Remove OMP_CLAUSE_ALLOCATOR.
(gfc_match_omp_clauses): Update for struct change.
(OMP_ALLOCATE_CLAUSES): Remove.
(OMP_ALLOCATORS_CLAUSES): Add.
(gfc_match_omp_allocate): Replace by upstream version.
(gfc_match_omp_allocators, is_predefined_allocator,
gfc_resolve_omp_allocate): New.
(verify_omp_clauses_symbol_dups): Update for allocate/allocator.
(resolve_omp_clauses): Update from mainline.
(omp_code_to_statement): Add EXEC_OMP_ALLOCATORS.
prepare_omp_allocated_var_list_for_cleanup,
check_allocate_directive_restrictions, EMPTY_VAR_LIST,
gfc_resolve_omp_allocate): Remove.
(gfc_resolve_omp_directive): Add EXEC_OMP_ALLOCATORS.
* parse.cc (check_omp_allocate_stmt, decode_omp_directive
next_statement, case_omp_decl, gfc_ascii_statement,
verify_st_order, parse_openmp_allocate_block,
parse_omp_structured_block, parse_executable): Update from mainline.
* resolve.cc (gfc_resolve_blocks, resolve_codes): Handle allocate
and allocator.
* st.cc (gfc_free_statement): And EXEC_OMP_ALLOCATORS.
* trans-decl.cc (gfc_trans_deferred_vars): Use OMP_LIST_ALLOCATE
instead of OMP_LIST_ALLOCATOR.
* trans-openmp.cc (gfc_trans_omp_clauses): Remove OMP_LIST_ALLOCATOR,
update allocate handling.
(gfc_trans_omp_allocate): Renamed to ...
(gfc_trans_omp_allocators): ... this.
(gfc_split_omp_clauses, gfc_trans_omp_directive): Update.
* trans.cc (trans_code): Handle EXEC_OMP_ALLOCATORS.
gcc/ChangeLog:
* omp-low.cc (scan_sharing_clauses): Update message for directive use.
(lower_omp_allocate, lower_omp_1): Update for clause changes; handle
alignment.
* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE_ALLOCATOR.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.cc (omp_clause_num_ops): Likewise.
* tree.h (OMP_ALLOCATE_DECL, OMP_ALLOCATE_ALLOCATOR): Remove.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/allocate-2a.f90: Add exec stmt.
* testsuite/libgomp.fortran/allocate-4.f90: Update dg-error
* testsuite/libgomp.fortran/allocate-5.f90: Likewise; add exec stmt.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-2.f90: Update dg-warning.
* gfortran.dg/gomp/allocate-4.f90: New upstream test.
* gfortran.dg/gomp/allocate-5.f90: New upstream test.
* gfortran.dg/gomp/allocate-6.f90: New upstream test.
* gfortran.dg/gomp/allocate-7.f90: New upstream test.
* gfortran.dg/gomp/allocate-4a.f90: Renamed from allocate-4.f90
* gfortran.dg/gomp/allocate-6a.f90: Renamed from allocate-6.f90
* gfortran.dg/gomp/allocate-7a.f90: Renamed from allocate-7.f90
* gfortran.dg/gomp/allocate-9a.f90: New test.
* gfortran.dg/gomp/allocators-1.f90: New upstream test.
* gfortran.dg/gomp/allocators-2.f90: New upstream test.
|
|
This patch adds support for OpenMP 5.0 "declare mapper" functionality
for C++. I've merged it to og13 based on the last version
posted upstream, with some minor changes due to the newly-added
'present' map modifier support. There's also a fix to splay-tree
traversal in gimplify.cc:omp_instantiate_implicit_mappers, and this patch
omits the rearrangement of gimplify.cc:gimplify_{scan,adjust}_omp_clauses
that I separated out into its own patch and applied (to og13) already.
2023-06-30 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (omp_mapper_list): Add forward declaration.
(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
* c-omp.cc (c_omp_find_nested_mappers): New function.
(remap_mapper_decl_info): New struct.
(remap_mapper_decl_1, omp_instantiate_mapper,
c_omp_instantiate_mappers): New functions.
gcc/cp/
* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
case.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function.
* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks.
* cp-tree.h (lang_decl_base): Add omp_declare_mapper_p field. Recount
spare bits comment.
(DECL_OMP_DECLARE_MAPPER_P): New macro.
(omp_mapper_id, cp_check_omp_declare_mapper, omp_instantiate_mappers,
cxx_omp_finish_mapper_clauses, cxx_omp_mapper_lookup,
cxx_omp_extract_mapper_directive, cxx_omp_map_array_section: Add
prototypes.
* decl.cc (check_initializer): Add OpenMP declare mapper support.
(cp_finish_decl): Set DECL_INITIAL for OpenMP declare mapper var decls
as appropriate.
* decl2.cc (mark_used): Instantiate OpenMP "declare mapper" magic var
decls.
* error.cc (dump_omp_declare_mapper): New function.
(dump_simple_decl): Use above.
* parser.cc (cp_parser_omp_clause_map): Add KIND parameter. Support
"mapper" modifier.
(cp_parser_omp_all_clauses): Add KIND argument to
cp_parser_omp_clause_map call.
(cp_parser_omp_target): Call omp_instantiate_mappers before
finish_omp_clauses.
(cp_parser_omp_declare_mapper): New function.
(cp_parser_omp_declare): Add "declare mapper" support.
* pt.cc (tsubst_decl): Adjust name of "declare mapper" magic var decls
once we know their type.
(tsubst_omp_clauses): Call omp_instantiate_mappers before
finish_omp_clauses, for target regions.
(tsubst_expr): Support OMP_DECLARE_MAPPER nodes.
(instantiate_decl): Instantiate initialiser (i.e definition) for OpenMP
declare mappers.
* semantics.cc (gimplify.h): Include.
(omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive,
cxx_omp_map_array_section, cp_check_omp_declare_mapper): New functions.
(finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and
GOMP_MAP_POP_MAPPER_NAME artificial clauses.
(omp_target_walk_data): Add MAPPERS field.
(finish_omp_target_clauses_r): Scan for uses of struct/union/class type
variables.
(finish_omp_target_clauses): Create artificial mapper binding clauses
for used structs/unions/classes in offload region.
gcc/fortran/
* parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes
(for additions to omp-general.h).
gcc/
* gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field.
(new_omp_context): Initialise IMPLICIT_MAPPERS hash map.
(delete_omp_context): Delete IMPLICIT_MAPPERS hash map.
(instantiate_mapper_info): New structs.
(remap_mapper_decl_1, omp_mapper_copy_decl, omp_instantiate_mapper,
omp_instantiate_implicit_mappers): New functions.
(gimplify_scan_omp_clauses): Handle MAPPER_BINDING clauses.
(gimplify_adjust_omp_clauses): Instantiate implicit declared mappers.
(gimplify_omp_declare_mapper): New function.
(gimplify_expr): Call above function.
* langhooks-def.h (lhd_omp_finish_mapper_clauses,
lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
lhd_omp_map_array_section): Add prototypes.
(LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define macros.
(LANG_HOOK_DECLS): Add above macros.
* langhooks.cc (lhd_omp_finish_mapper_clauses,
lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
lhd_omp_map_array_section): New dummy functions.
* langhooks.h (lang_hooks_for_decls): Add OMP_FINISH_MAPPER_CLAUSES,
OMP_MAPPER_LOOKUP, OMP_EXTRACT_MAPPER_DIRECTIVE, OMP_MAP_ARRAY_SECTION
hooks.
* omp-general.h (omp_name_type<T>): Add templatized struct, hash type
traits (for omp_name_type<tree> specialization).
(omp_mapper_list<T>): Add struct.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE__MAPPER_BINDING_.
* tree-pretty-print.cc (dump_omp_clause): Support GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clauses. Support OMP_CLAUSE__MAPPER_BINDING_ and OMP_DECLARE_MAPPER.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE__MAPPER_BINDING_.
* tree.def (OMP_DECLARE_MAPPER): New tree code.
* tree.h (OMP_DECLARE_MAPPER_ID, OMP_DECLARE_MAPPER_DECL,
OMP_DECLARE_MAPPER_CLAUSES): New defines.
(OMP_CLAUSE__MAPPER_BINDING__ID, OMP_CLAUSE__MAPPER_BINDING__DECL,
OMP_CLAUSE__MAPPER_BINDING__MAPPER): New defines.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clause types.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update error scan output.
* c-c++-common/gomp/declare-mapper-3.c: New test (only enabled for C++
for now).
* c-c++-common/gomp/declare-mapper-4.c: Likewise.
* c-c++-common/gomp/declare-mapper-5.c: Likewise.
* c-c++-common/gomp/declare-mapper-6.c: Likewise.
* c-c++-common/gomp/declare-mapper-7.c: Likewise.
* c-c++-common/gomp/declare-mapper-8.c: Likewise.
* c-c++-common/gomp/declare-mapper-9.c: Likewise.
* c-c++-common/gomp/declare-mapper-12.c: Likewise.
* g++.dg/gomp/declare-mapper-1.C: New test.
* g++.dg/gomp/declare-mapper-2.C: New test.
libgomp/
* testsuite/libgomp.c++/declare-mapper-1.C: New test.
* testsuite/libgomp.c++/declare-mapper-2.C: New test.
* testsuite/libgomp.c++/declare-mapper-3.C: New test.
* testsuite/libgomp.c++/declare-mapper-4.C: New test.
* testsuite/libgomp.c++/declare-mapper-5.C: New test.
* testsuite/libgomp.c++/declare-mapper-6.C: New test.
* testsuite/libgomp.c++/declare-mapper-7.C: New test.
* testsuite/libgomp.c++/declare-mapper-8.C: New test.
* testsuite/libgomp.c-c++-common/declare-mapper-9.c: New test (only
enabled for C++ for now).
* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
|
|
This patch changes the mapping node arrangement used for array components
of derived types, e.g.:
type T
integer, pointer, dimension(:) :: arrptr
end type T
type(T) :: tvar
[...]
!$omp target map(tofrom: tvar%arrptr)
This will currently be mapped using three mapping nodes:
GOMP_MAP_TO tvar%arrptr (the descriptor)
GOMP_MAP_TOFROM *tvar%arrptr%data (the actual array data)
GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (a pointer to the array data)
This follows OMP 5.0, 2.19.7.1 (or OpenMP 5.2, 5.8.3) "map Clause":
"If a list item in a map clause is an associated pointer and the
pointer is not the base pointer of another list item in a map clause
on the same construct, then it is treated as if its pointer target
is implicitly mapped in the same clause. For the purposes of the map
clause, the mapped pointer target is treated as if its base pointer
is the associated pointer."
However, we can also write this:
map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))
and then instead we should follow (OpenMP 5.2, 5.8.3 "map Clause"):
"For map clauses on map-entering constructs, if any list item has a base
pointer for which a corresponding pointer exists in the data environment
upon entry to the region and either a new list item or the corresponding
pointer is created in the device data environment on entry to the region,
then:
1. [Fortran] The corresponding pointer variable is associated with
a pointer target that has the same rank and bounds as the pointer
target of the original pointer, such that the corresponding list item
can be accessed through the pointer in a target region.
2. The corresponding pointer variable becomes an attached pointer
for the corresponding list item."
With this patch you can write the above mappings, and the mapping nodes
used to map pointers to array sections (with descriptors) now look
like this:
1) map(to: tvar%arrptr) -->
GOMP_MAP_TO [implicit] *tvar%arrptr%data (the array data)
GOMP_MAP_TO_PSET tvar%arrptr (the descriptor)
GOMP_MAP_ATTACH_DETACH tvar%arrptr%data
2) map(tofrom: tvar%arrptr(3:8) -->
GOMP_MAP_TOFROM *tvar%arrptr%data(3) (size 8-3+1, etc.)
GOMP_MAP_TO_PSET tvar%arrptr
GOMP_MAP_ATTACH_DETACH tvar%arrptr%data (bias 3, etc.)
In this case, we can determine in the front-end that the
whole-array/pointer mapping (1) is only needed to map the pointer --
so we drop it entirely. (Note also that we set -- early -- the
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P flag for whole-array-via-pointer
mappings. See below.)
In the middle end, we process mappings using the struct sibling-list
handling machinery by moving the "GOMP_MAP_TO_PSET" node from the middle
of the group of three mapping nodes to the proper sorted position after
the GOMP_MAP_STRUCT mapping:
GOMP_MAP_STRUCT tvar (len: 1)
GOMP_MAP_TO_PSET tvar%arr (size: 64, etc.) <--. moved here
[...] |
GOMP_MAP_TOFROM *tvar%arrptr%data(3) ___|
GOMP_MAP_ATTACH_DETACH tvar%arrptr%data
In another case, if we have an array of derived-type values "dtarr",
and mappings like:
i = 1
j = 1
map(to: dtarr(i)%arrptr) map(tofrom: dtarr(j)%arrptr(3:8))
We still map the same way, but this time we cannot prove that the base
expressions "dtarr(i) and "dtarr(j)" are the same in the front-end.
So we keep both mappings, but we move the "[implicit]" mapping of the
full-array reference to the end of the clause list in gimplify.cc (by
adjusting the topological sorting algorithm):
GOMP_MAP_STRUCT dtvar (len: 2)
GOMP_MAP_TO_PSET dtvar(i)%arrptr
GOMP_MAP_TO_PSET dtvar(j)%arrptr
[...]
GOMP_MAP_TOFROM *dtvar(j)%arrptr%data(3) (size: 8-3+1)
GOMP_MAP_ATTACH_DETACH dtvar(j)%arrptr%data
GOMP_MAP_TO [implicit] *dtvar(i)%arrptr%data(1) (size: whole array)
GOMP_MAP_ATTACH_DETACH dtvar(i)%arrptr%data
Always moving "[implicit]" full-array mappings after array-section
mappings (without that bit set) means that we'll avoid copying the whole
array unnecessarily -- even in cases where we can't prove that the arrays
are the same.
The patch also fixes some bugs with "enter data" and "exit data"
directives with this new mapping arrangement. Also now if you have
mappings like this:
#pragma omp target enter data map(to: dv, dv%arr(1:20))
The whole of the derived-type variable "dv" is mapped, so the
GOMP_MAP_TO_PSET for the array-section mapping can be dropped:
GOMP_MAP_TO dv
GOMP_MAP_TO *dv%arr%data
GOMP_MAP_TO_PSET dv%arr <-- deleted (array section mapping)
GOMP_MAP_ATTACH_DETACH dv%arr%data
To accommodate for recent changes to mapping nodes made by
Tobias, this version of the patch avoids using GOMP_MAP_TO_PSET
for "exit data" directives, in favour of using the "correct"
GOMP_MAP_RELEASE/GOMP_MAP_DELETE kinds during early expansion.
2023-06-16 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* dependency.cc (gfc_omp_expr_prefix_same): New function.
* dependency.h (gfc_omp_expr_prefix_same): Add prototype.
* gfortran.h (gfc_omp_namelist): Add "duplicate_of" field to "u2"
union.
* trans-openmp.cc (dependency.h): Include.
(gfc_trans_omp_array_section): Adjust mapping node arrangement for
array descriptors. Use GOMP_MAP_TO_PSET or
GOMP_MAP_RELEASE/GOMP_MAP_DELETE with the OMP_CLAUSE_RELEASE_DESCRIPTOR
flag set.
(gfc_symbol_rooted_namelist): New function.
(gfc_trans_omp_clauses): Check subcomponent and subarray/element
accesses elsewhere in the clause list for pointers to derived types or
array descriptors, and adjust or drop mapping nodes appropriately.
Adjust for changes to mapping node arrangement.
(gfc_trans_oacc_executable_directive): Pass code op through.
gcc/
* gimplify.cc (omp_map_clause_descriptor_p): New function.
(build_omp_struct_comp_nodes, omp_get_attachment, omp_group_base): Use
above function.
(omp_tsort_mapping_groups): Process nodes that have
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P set after those that don't. Add
enter_exit_data parameter.
(omp_resolve_clause_dependencies): Remove GOMP_MAP_TO_PSET mappings if
we're mapping the whole containing derived-type variable.
(omp_accumulate_sibling_list): Adjust GOMP_MAP_TO_PSET handling.
Remove GOMP_MAP_ALWAYS_POINTER handling.
(gimplify_scan_omp_clauses): Pass enter_exit argument to
omp_tsort_mapping_groups. Don't adjust/remove GOMP_MAP_TO_PSET
mappings for derived-type components here.
* tree.h (OMP_CLAUSE_RELEASE_DESCRIPTOR): New macro.
gcc/testsuite/
* gfortran.dg/gomp/map-9.f90: Adjust scan output.
* gfortran.dg/gomp/map-subarray-2.f90: New test.
* gfortran.dg/gomp/map-subarray.f90: New test.
libgomp/
* testsuite/libgomp.fortran/map-subarray.f90: New test.
* testsuite/libgomp.fortran/map-subarray-2.f90: New test.
* testsuite/libgomp.fortran/map-subarray-3.f90: New test.
* testsuite/libgomp.fortran/map-subarray-4.f90: New test.
* testsuite/libgomp.fortran/map-subarray-6.f90: New test.
* testsuite/libgomp.fortran/map-subarray-7.f90: New test.
* testsuite/libgomp.fortran/map-subarray-8.f90: New test.
* testsuite/libgomp.fortran/map-subcomponents.f90: New test.
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Adjust for
descriptor-mapping changes. Remove XFAIL.
|
|
This patch is an extension and rewrite/rethink of the following two patches:
"OpenMP/OpenACC: Add inspector class to unify mapped address analysis"
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591977.html
"OpenMP: Handle reference-typed struct members"
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591978.html
The latter was reviewed here by Jakub:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595510.html with the
with the comment,
> Why isn't a reference to pointer handled that way too?
and that opened a whole can of worms... generally, C++ references were
not handled very consistently after the clause-processing code had been
extended several times already for both OpenACC and OpenMP, and many
cases of using C++ (and Fortran) references were broken. Even some
cases not involving references were being mapped incorrectly.
At present a single clause may be turned into several mapping nodes,
or have its mapping type changed, in several places scattered through
the front- and middle-end. The analysis relating to which particular
transformations are needed for some given expression has become quite hard
to follow. Briefly, we manipulate clause types in the following places:
1. During parsing, in c_omp_adjust_map_clauses. Depending on a set of
rules, we may change a FIRSTPRIVATE_POINTER (etc.) mapping into
ATTACH_DETACH, or mark the decl addressable.
2. In semantics.cc or c-typeck.cc, clauses are expanded in
handle_omp_array_sections (called via {c_}finish_omp_clauses, or in
finish_omp_clauses itself. The two cases are for processing array
sections (the former), or non-array sections (the latter).
3. In gimplify.cc, we build sibling lists for struct accesses, which
groups and sorts accesses along with their struct base, creating
new ALLOC/RELEASE nodes for pointers.
4. In gimplify.cc:gimplify_adjust_omp_clauses, mapping nodes may be
adjusted or created.
This patch doesn't completely disrupt this scheme, though clause
types are no longer adjusted in c_omp_adjust_map_clauses (step 1).
Clause expansion in step 2 (for C and C++) now uses a single, unified
mechanism, parts of which are also reused for analysis in step 3.
Rather than the kind-of "ad-hoc" pattern matching on addresses used to
expand clauses used at present, a new method for analysing addresses is
introduced. This does a recursive-descent tree walk on expression nodes,
and emits a vector of tokens describing each "part" of the address.
This tokenized address can then be translated directly into mapping nodes,
with the assurance that no part of the expression has been inadvertently
skipped or misinterpreted. In this way, all the variations of ways
pointers, arrays, references and component accesses might be combined
can be teased apart into easily-understood cases - and we know we've
"parsed" the whole address before we start analysis, so the right code
paths can easily be selected.
For example, a simple access "arr[idx]" might parse as:
base-decl access-indexed-array
or "mystruct->foo[x]" with a pointer "foo" component might parse as:
base-decl access-pointer component-selector access-pointer
A key observation is that support for "array" bases, e.g. accesses
whose root nodes are not structures, but describe scalars or arrays,
and also *one-level deep* structure accesses, have first-class support
in gimplify and beyond. Expressions that use deeper struct accesses
or e.g. multiple indirections were more problematic: some cases worked,
but lots of cases didn't. This patch reimplements the support for those
in gimplify.cc, again using the new "address tokenization" support.
An expression like "mystruct->foo->bar[0:10]" used in a mapping node will
translate the right-hand access directly in the front-end. The base for
the access will be "mystruct->foo". This is handled recursively in
gimplify.cc -- there may be several accesses of "mystruct"'s members
on the same directive, so the sibling-list building machinery can be
used again. (This was already being done for OpenACC, but the new
implementation differs somewhat in details, and is more robust.)
For OpenMP, in the case where the base pointer itself,
i.e. "mystruct->foo" here, is NOT mapped on the same directive, we
create a "fragile" mapping. This turns the "foo" component access
into a zero-length allocation (which is a new feature for the runtime,
so support has been added there too).
A couple of changes have been made to how mapping clauses are turned
into mapping nodes:
The first change is based on the observation that it is probably never
correct to use GOMP_MAP_ALWAYS_POINTER for component accesses (e.g. for
references), because if the containing struct is already mapped on the
target then the host version of the pointer in question will be corrupted
if the struct is copied back from the target. This patch removes all
such uses, across each of C, C++ and Fortran.
The second change is to the way that GOMP_MAP_ATTACH_DETACH nodes
are processed during sibling-list creation. For OpenMP, for pointer
components, we must map the base pointer separately from an array section
that uses the base pointer, so e.g. we must have both "map(mystruct.base)"
and "map(mystruct.base[0:10])" mappings. These create nodes such as:
GOMP_MAP_TOFROM mystruct.base
G_M_TOFROM *mystruct.base [len: 10*elemsize] G_M_ATTACH_DETACH mystruct.base
Instead of using the first of these directly when building the struct
sibling list then skipping the group using GOMP_MAP_ATTACH_DETACH,
leading to:
GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_TOFROM mystruct.base
we now introduce a new "mini-pass", omp_resolve_clause_dependencies, that
drops the GOMP_MAP_TOFROM for the base pointer, marks the second group
as having had a base-pointer mapping, then omp_build_struct_sibling_lists
can create:
GOMP_MAP_STRUCT mystruct [len: 1] GOMP_MAP_ALLOC mystruct.base [len: ptrsize]
This ends up working better in many cases, particularly those involving
references. (The "alloc" space is immediately overwritten by a pointer
attachment, so this is mildly more efficient than a redundant TO mapping
at runtime also.)
There is support in the address tokenizer for "arbitrary" base expressions
which aren't rooted at a decl, but that is not used as present because
such addresses are disallowed at parse time.
In the front-ends, the address tokenization machinery is mostly only
used for clause expansion and not for diagnostics at present. It could
be used for those too, which would allow more of my previous "address
inspector" implementation to be removed.
The new bits in gimplify.cc work with OpenACC also.
This version of the patch has been rebased for og13 and incorporates a
couple of follow-up patches.
2023-06-16 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_ACC_TARGET.
(omp_addr_token): Add forward declaration.
(c_omp_address_inspector): New class.
* c-omp.cc (c_omp_adjust_map_clauses): Mark decls addressable here, but
do not change any mapping node types.
(c_omp_address_inspector::unconverted_ref_origin,
c_omp_address_inspector::component_access_p,
c_omp_address_inspector::check_clause,
c_omp_address_inspector::get_root_term,
c_omp_address_inspector::map_supported_p,
c_omp_address_inspector::get_origin,
c_omp_address_inspector::maybe_unconvert_ref,
c_omp_address_inspector::maybe_zero_length_array_section,
c_omp_address_inspector::expand_array_base,
c_omp_address_inspector::expand_component_selector,
c_omp_address_inspector::expand_map_clause): New methods.
(omp_expand_access_chain): New function.
gcc/c/
* c-parser.cc (c_parser_oacc_all_clauses): Add TARGET parameter. Use
to select region type for c_finish_omp_clauses call.
(c_parser_oacc_loop): Update calls to c_parser_oacc_all_clauses.
(c_parser_oacc_compute): Likewise.
(c_parser_omp_target_enter_data): Support ATTACH kind.
(c_parser_omp_target_exit_data): Support DETACH kind.
* c-typeck.cc (handle_omp_array_sections_1,
handle_omp_array_sections, c_finish_omp_clauses): Use
c_omp_address_inspector class and OMP address tokenizer to analyze and
expand map clause expressions. Fix some diagnostics. Fix for
C_ORT_ACC_TARGET.
gcc/cp/
* parser.cc (cp_parser_oacc_all_clauses): Add TARGET parameter. Use
to select region type for finish_omp_clauses call.
(cp_parser_omp_target_enter_data): Support GOMP_MAP_ATTACH kind.
(cp_parser_omp_target_exit_data): Support GOMP_MAP_DETACH kind.
(cp_parser_oacc_declare): Update call to cp_parser_oacc_all_clauses.
(cp_parser_oacc_loop): Update calls to cp_parser_oacc_all_clauses.
(cp_parser_oacc_compute): Likewise.
* pt.cc (tsubst_expr): Use C_ORT_ACC_TARGET for call to
tsubst_omp_clauses for compute regions.
* semantics.cc (cp_omp_address_inspector): New class, derived from
c_omp_address_inspector.
(handle_omp_array_sections_1, handle_omp_array_sections,
finish_omp_clauses): Use cp_omp_address_inspector class and OMP address
tokenizer to analyze and expand OpenMP map clause expressions. Fix
some diagnostics. Support C_ORT_ACC_TARGET.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_array_section): Add OPENMP parameter.
Use GOMP_MAP_ATTACH_DETACH instead of GOMP_MAP_ALWAYS_POINTER for
derived type components.
(gfc_trans_omp_clauses): Update calls to gfc_trans_omp_array_section.
gcc/
* gimplify.cc (build_struct_comp_nodes): Don't process
GOMP_MAP_ATTACH_DETACH "middle" nodes here.
(omp_mapping_group): Add REPROCESS_STRUCT and FRAGILE booleans for
nested struct handling.
(omp_strip_components_and_deref, omp_strip_indirections): Remove
functions.
(omp_gather_mapping_groups_1): Initialise reprocess_struct and fragile
fields.
(omp_group_base): Handle GOMP_MAP_ATTACH_DETACH after GOMP_MAP_STRUCT.
(omp_index_mapping_groups_1): Skip reprocess_struct groups.
(omp_get_nonfirstprivate_group, omp_directive_maps_explicitly,
omp_resolve_clause_dependencies, omp_expand_access_chain,
omp_first_chained_access_token): New functions.
(omp_check_mapping_compatibility): Adjust accepted node combinations
for changes elsewhere.
(omp_accumulate_sibling_list): Add GROUP_MAP, ADDR_TOKENS, FRAGILE_P,
REPROCESSING_STRUCT, ADDED_TAIL parameters. Use OMP address tokenizer
to analyze addresses. Reimplement nested struct handling, and
implement "fragile groups".
(omp_build_struct_sibling_lists): Adjust for changes to
omp_accumulate_sibling_list. Recalculate bias for ATTACH_DETACH nodes
after GOMP_MAP_STRUCT nodes.
(gimplify_scan_omp_clauses): Call omp_resolve_clause_dependencies. Use
OMP address tokenizer.
(gimplify_adjust_omp_clauses_1): Use build_fold_indirect_ref_loc
instead of build_simple_mem_ref_loc.
* omp-general.cc (omp-general.h): Include.
(omp_addr_tokenizer): New namespace.
(omp_addr_tokenizer::omp_addr_token): New.
(omp_addr_tokenizer::omp_parse_component_selector,
omp_addr_tokenizer::omp_parse_ref,
omp_addr_tokenizer::omp_parse_pointer,
omp_addr_tokenizer::omp_parse_access_method,
omp_addr_tokenizer::omp_parse_access_methods,
omp_addr_tokenizer::omp_parse_structure_base,
omp_addr_tokenizer::omp_parse_structured_expr,
omp_addr_tokenizer::omp_parse_array_expr,
omp_addr_tokenizer::omp_access_chain_p,
omp_addr_tokenizer::omp_accessed_addr): New functions.
(omp_parse_expr, debug_omp_tokenized_addr): New functions.
* omp-general.h (omp_addr_tokenizer::access_method_kinds,
omp_addr_tokenizer::structure_base_kinds,
omp_addr_tokenizer::token_type,
omp_addr_tokenizer::omp_addr_token,
omp_addr_tokenizer::omp_access_chain_p,
omp_addr_tokenizer::omp_accessed_addr): New.
(omp_addr_token, omp_parse_expr): New.
* omp-low.cc (scan_sharing_clauses): Skip error check for references
to pointers.
* tree.h (OMP_CLAUSE_ATTACHMENT_MAPPING_ERASED): New macro.
gcc/testsuite/
* c-c++-common/gomp/clauses-2.c: Fix error output.
* c-c++-common/gomp/target-implicit-map-2.c: Adjust scan output.
* c-c++-common/gomp/target-50.c: Adjust scan output.
* c-c++-common/gomp/target-enter-data-1.c: Adjust scan output.
* g++.dg/gomp/static-component-1.C: New test.
* gcc.dg/gomp/target-3.c: Adjust scan output.
* gfortran.dg/gomp/map-9.c.f90: Adjust scan output.
libgomp/
* target.c (gomp_map_fields_existing): Use gomp_map_0len_lookup.
(gomp_attach_pointer): Allow attaching null pointers (or Fortran
"unassociated" pointers).
(gomp_map_vars_internal): Handle zero-sized struct members. Add
diagnostic for unmapped struct pointer members.
* testsuite/libgomp.c++/class-array-1.C: New test.
* testsuite/libgomp.c-c++-common/baseptrs-1.c: New test.
* testsuite/libgomp.c-c++-common/baseptrs-2.c: New test.
* testsuite/libgomp.c-c++-common/ptr-attach-2.c: New test.
* testsuite/libgomp.c++/baseptrs-3.C: New test.
* testsuite/libgomp.c++/baseptrs-4.C: New test.
* testsuite/libgomp.c++/baseptrs-5.C: New test.
* testsuite/libgomp.c++/target-48.C: New test.
* testsuite/libgomp.c++/target-49.C: New test.
* testsuite/libgomp.c/target-22.c: Add necessary explicit base pointer
mappings.
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Add XFAIL.
|
|
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct. It is also supported in defaultmap.
The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device. It can also be combined
with 'always' in map clauses.
2023-06-06 Kwok Cheung Yeung <kcy@codesourcery.com>
Tobias Burnus <tobias@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_omp_clause_defaultmap,
c_parser_omp_clause_map): Parse 'present'.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
(c_parser_omp_clause_from_to): New; parse to/from clauses with
optional present modifer.
(c_parser_omp_all_clauses): Update call.
(c_parser_omp_target_data, c_parser_omp_target_enter_data,
c_parser_omp_target_exit_data): Handle new map enum values
for 'present' mapping.
gcc/cp/
* parser.cc (cp_parser_omp_clause_defaultmap,
cp_parser_omp_clause_map): Parse 'present'.
(cp_parser_omp_clause_from_to): New; parse to/from
clauses with optional 'present' modifier.
(cp_parser_omp_all_clauses): Update call.
(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
cp_parser_omp_target_exit_data): Handle new enum value for
'present' mapping.
* semantics.cc (finish_omp_target): Likewise.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
modifiers.
(struct gfc_omp_namelist): Add 'present_modifer'.
* openmp.cc (gfc_match_motion_var_list): New, handles optional
'present' modifier for to/from clauses.
(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
in defaultmap and map clauses.
(resolve_omp_clauses): Allow 'present' modifiers on 'target',
'target data', 'target enter' and 'target exit' directives.
* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
to tree node for 'map', 'to' and 'from' clauses. Apply 'present' for
defaultmap.
gcc/
* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
set.
(omp_get_attachment): Handle map clauses with 'present' modifier.
(omp_group_base): Likewise.
(gimplify_scan_omp_clauses): Reorder present maps to come first.
Set GOVD flags for present defaultmaps.
(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
clauses.
(lower_omp_target): Handle map clauses with 'present' modifier.
Handle 'to' and 'from' clauses with 'present'.
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
'from' clauses with 'present' modifier. Handle present defaultmap.
* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
(GOMP_MAP_FLAG_FORCE): Redefine.
(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
(enum gomp_map_kind): Add map kinds with 'present' modifiers.
(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
map variants with 'present'
(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
for map variants with 'always, present' modifiers.
(GOMP_MAP_ALWAYS): Redefine.
(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.
libgomp/
* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
modifier.
(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
(gomp_map_vars_internal, gomp_update, gomp_target_rev):
Emit runtime error if memory region not present.
* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
* testsuite/libgomp.fortran/target-present-1.f90: New test.
* testsuite/libgomp.fortran/target-present-2.f90: New test.
* testsuite/libgomp.fortran/target-present-3.f90: New test.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
duplicated 'present' and extend scan-dump tests for 'present'.
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
'present'.
* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
modifier checking.
* c-c++-common/gomp/defaultmap-4.c: New test.
* c-c++-common/gomp/map-9.c: New test.
* c-c++-common/gomp/target-update-1.c: New test.
* gfortran.dg/gomp/defaultmap-8.f90: New test.
* gfortran.dg/gomp/map-11.f90: New test.
* gfortran.dg/gomp/map-12.f90: New test.
* gfortran.dg/gomp/target-update-1.f90: New test.
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
(cherry picked from commit 4ede915d5dde935a16df2c6640aee5ab22348d30)
|
|
This reverts commit 6e3816fa47c56d87d21d90f67435a1ab1664d8db.
which then permits to apply the mainline patch of it more cleanly.
|
|
(forward ported from devel/omp/gcc-12)
This is a backport of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html
This patch implements '-fopenmp-target=acc', which enables internally handling
a subset of OpenMP target regions as OpenACC parallel regions. This basically
includes target, teams, parallel, distribute, for/do constructs, and atomics.
Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code
paths handle them, with various needed adjustments throughout middle-end and
nvptx backend. When using this "OMPACC" mode, if there are cases the patch
doesn't handle, it issues a warning, and reverts to normal processing for that
target region.
gcc/ChangeLog:
* builtins.cc (expand_builtin_omp_builtins): New function.
(expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER,
BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS,
BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using
expand_builtin_omp_builtins, enabled under -fopenmp-target=acc.
* cgraphunit.cc (analyze_functions): Add call to
omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc.
* common.opt (fopenmp-target=): Add new option and enums.
* config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=.
* config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New
prototype.
(nvptx_mem_shared_p): Likewise.
* config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX
symbol for number of threads in team.
(omp_num_threads_align): New var for alignment of omp_num_threads_sym.
(need_omp_num_threads): New bool for if any function references
omp_num_threads_sym.
(nvptx_option_override): Initialize omp_num_threads_sym/align.
(write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode.
(nvptx_declare_function_name): Disable shim function under OMPACC mode.
Disable soft-stack under OMPACC mode. Add generation of neutering init
code under OMPACC mode.
(nvptx_output_set_softstack): Return "" under OMPACC mode.
(nvptx_expand_call): Set parallelism to vector for function calls with
"ompacc for" attached.
(nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode.
(nvptx_expand_oacc_join): Likewise.
(nvptx_expand_omp_get_num_threads): New function.
(nvptx_mem_shared_p): New function.
(nvptx_mach_max_workers): Return 1 under OMPACC mode.
(nvptx_mach_vector_length): Return 32 under OMPACC mode.
(nvptx_single): Add adjustments for OMPACC mode, which have
parallel-construct fork/joins, and regions of code where neutering is
dynamically determined.
(nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for"
attribute is attached to function. Disable uniform-simt when under
OMPACC mode.
(nvptx_file_end): Write __nvptx_omp_num_threads out when needed.
(nvptx_goacc_fork_join): Return true under OMPACC mode.
* config/nvptx/nvptx.h (struct GTY(()) machine_function): Add
omp_parallel_predicate and omp_fn_entry_num_threads_reg fields.
* config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID,
UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID,
UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries.
(nvptx_shared_mem_operand): New predicate.
(gomp_barrier): New expand pattern.
(omp_get_num_threads): New expand pattern.
(omp_get_num_teams): New insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_team_num): Likewise.
(get_ntid): Likewise.
(nvptx_omp_parallel_fork): Likewise.
(nvptx_omp_parallel_join): Likewise.
* flag-types.h (omp_target_mode_kind): New flag value enum.
* gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_ctx_ompacc_p): New function.
(gimplify_omp_for): Handle combined loops under OMPACC.
* lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_.
* omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST.
(BUILT_IN_OMP_GET_NUM_THREADS): Likewise.
* omp-expand.cc (remove_exit_barrier): Disable addressable-var
processing for parallel construct child functions under OMPACC mode.
(expand_oacc_for): Add OMPACC mode handling.
(get_target_arguments): Force thread_limit clause value to 1 under
OMPACC mode.
(expand_omp): Under OMPACC mode, avoid child function expanding of
GIMPLE_OMP_PARALLEL.
* omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode.
* omp-low.cc (struct omp_context): Add 'bool ompacc_p' field.
(scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_.
(ompacc_ctx_p): New function.
(scan_omp_parallel): Handle OMPACC mode, avoid creating child function.
(scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target
construct child function, remove OMP_CLAUSE__OMPACC_ clauses.
(lower_oacc_head_mark): Handle OMPACC mode cases.
(lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add
vector/gang clauses as needed. Add other OMPACC handling.
(lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case.
(lower_omp_target): Do OpenACC gang privatization under OMPACC case.
(lower_omp_teams): Forward OpenACC privatization variables to outer
target region under OMPACC mode.
(lower_omp_1): Do OpenACC gang privatization under OMPACC case for
GIMPLE_BIND.
* omp-offload.cc (ompacc_supported_clauses_p): New function.
(struct target_region_data): New struct type for tree walk.
(scan_fndecl_for_ompacc): New function.
(scan_omp_target_region_r): New function.
(scan_omp_target_construct_r): New function.
(omp_ompacc_attribute_tagging): New function.
(oacc_dim_call): Add OMPACC case handling.
(execute_oacc_device_lower): Make parts explicitly only OpenACC enabled.
(pass_oacc_device_lower::gate): Enable pass under OMPACC mode.
* omp-offload.h (omp_ompacc_attribute_tagging): New prototype.
* opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp
and no -fopenacc.
* target-insns.def (gomp_barrier): New defined insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_num_threads): Likewise.
(omp_get_team_num): Likewise.
(omp_get_num_teams): Likewise.
* tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry
for internal clause.
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE__OMPACC_.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_.
* tree-ssa-loop.cc (pass_oacc_only::gate): Enable pass under OMPACC
mode cases.
libgomp/ChangeLog:
* config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in
shared memory.
(cherry picked from commit 5f881613fa9128edae5bbfa4e19f9752809e4bd7)
|
|
So far the implementation of the "omp tile" and "omp unroll"
directives restricted their use to the outermost loop of a loop-nest.
This commit changes the Fortran front end to parse and verify the
directives on inner loops. The transformation clauses are extended to
carry the information about the level of the loop-nest at which a
transformation should be applied. The middle end transformation pass
is adjusted to apply the transformations at the right level of a loop
nest and to take their effect on the loop nest depth into account.
gcc/fortran/ChangeLog:
* openmp.cc (omp_unroll_removes_loop_nest): Move down in file.
(resolve_loop_transform_generic): Remove, and ...
(resolve_omp_unroll): ... inline and adapt here. Move function.
Move functin.
(find_nested_loop_in_block): New function.
(find_nested_loop_in_chain): New function, used ...
(is_outer_iteration_variable): ... here, and ...
(expr_is_invariant): ... here.
(resolve_omp_do): Adjust code for resolving loop transformations.
(resolve_omp_tile): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Set OMP_TRANSFROM_LEVEL
on new clause.
(compute_transformed_depth): New function to compute the depth
("collapse") of a transformed loop nest, used
(gfc_trans_omp_do): ... here.
gcc/ChangeLog:
* omp-transform-loops.cc (gimple_assign_rhs_to_tree): Fix type
in comment.
(gomp_for_uncollapse): Adjust "collapse" value after uncollapse.
(partial_unroll): Add argument for the loop nest level to be transformed.
(tile): Likewise.
(transform_gomp_for): Pass level to transformatoin functions.
(optimize_transformation_clauses): Handle transformation clauses for all
levels recursively.
* tree-pretty-print.cc (dump_omp_clause): Print
OMP_CLAUSE_TRANSFORM_LEVEL for OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.cc: Increase number of operands of OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, and OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TRANSFORM_LEVEL): New macro to access
clause operand 0.
(OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Use operand 1 instead of 0.
(OMP_CLAUSE_TILE_SIZES): Likewise.
gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(cp_parser_omp_clause_unroll_partial): Likewise.
(cp_parser_omp_tile_sizes): Likewise.
(cp_parser_omp_loop_transform_clause): Likewise.
(cp_parser_omp_nested_loop_transform_clauses): Likewise.
(cp_parser_omp_unroll): Likewise.
* pt.cc (tsubst_omp_clauses): Adjust OMP_CLAUSE_UNROLL_PARTIAL
and OMP_CLAUSE_TILE handling to changed number of operands.
gcc/c/ChangeLog
* c-parser.cc (c_parser_omp_clause_unroll_full): Set new
OMP_CLAUSE_TRANSFORM_LEVEL operand to default value.
(c_parser_omp_clause_unroll_partial): Likewise.
(c_parser_omp_tile_sizes): Likewise.
(c_parser_omp_loop_transform_clause): Likewise.
(c_parser_omp_nested_loop_transform_clauses): Likewise.
(c_parser_omp_unroll): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/loop-transforms/unroll-8.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-9.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: Adjust.
* gfortran.dg/gomp/loop-transforms/inner-loops.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-imperfect-nest.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-3a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-4a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-inner-loops-5.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-inner-loop.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-inner-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: Adapt to
changed diagnostic messages.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/inner-1.f90: New test.
|
|
This commit implements the Fortran front end support for the "omp
tile" directive and the corresponding middle end transformation.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE.
(enum gfc_exec_op): Add EXEC_OMP_TILE.
(loop_transform_p): New declaration.
(struct gfc_omp_clauses): Add "tile_sizes" field.
* dump-parse-tree.cc (show_omp_clauses): Handle "tile_sizes" dumping.
(show_omp_node): Handle EXEC_OMP_TILE.
(show_code_node): Likewise.
* match.h (gfc_match_omp_tile): New declaration.
* openmp.cc (gfc_free_omp_clauses): Free "tile_sizes" field.
(match_tile_sizes): New function.
(OMP_TILE_CLAUSES): New macro.
(gfc_match_omp_tile): New function.
(resolve_omp_do): Handle EXEC_OMP_TILE.
(resolve_omp_tile): New function.
(omp_code_to_statement): Handle EXEC_OMP_TILE.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Handle ST_OMP_END_TILE
and ST_OMP_TILE.
(next_statement): Handle ST_OMP_TILE.
(gfc_ascii_statement): Likewise.
(parse_omp_do): Likewise.
(parse_executable): Likewise.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_TILE.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle "tile_sizes" field.
(loop_transform_p): New function.
(gfc_expr_list_len): New function.
(gfc_trans_omp_do): Handle EXEC_OMP_TILE.
(gfc_trans_omp_directive): Likewise.
* trans.cc (trans_code): Likewise.
gcc/ChangeLog:
* gimplify.cc (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_TILE.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_loop): Likewise.
* omp-transform-loops.cc (walk_omp_for_loops): New declaration.
(subst_var_in_op): New function.
(subst_var): New function.
(gomp_for_number_of_iterations): Adjust.
(gomp_for_iter_count_type): New function.
(gimple_assign_rhs_to_tree): New function.
(subst_defs): New function.
(gomp_for_uncollapse): Adjust.
(transformation_clause_p): Add OMP_CLAUSE_TILE.
(tile): New function.
(transform_gomp_for): Handle OMP_CLAUSE_TILE.
(optimize_transformation_clauses): Handle OMP_CLAUSE_TILE.
* omp-general.cc (omp_loop_transform_clause_p): Add
OMP_CLAUSE_TILE.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_TILE.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_TILE.
* tree.cc: Add OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_SIZES): New macro.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-3.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/tile-unroll-4.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-tile-2.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/loop-transforms/tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-1a.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/tile-unroll-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-tile-2.f90: New test.
|
|
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation
construct "omp tile".
gcc/ChangeLog:
* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE.
* tree.h (OMP_CLAUSE_TILE_LIST): Rename to ...
(OMP_CLAUSE_OACC_TILE_LIST): ... this.
(OMP_CLAUSE_TILE_ITERVAR): Rename to ...
(OMP_CLAUSE_OACC_TILE_ITERVAR): ... this.
(OMP_CLAUSE_TILE_COUNT): Rename to ...
(OMP_CLAUSE_OACC_TILE_COUNT): this.
* gimplify.cc (gimplify_scan_omp_clauses): Adjust to renamings.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_for): Likewise.
* omp-general.cc (omp_extract_for_data): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
(lower_oacc_head_mark): Likewise.
* tree-nested.cc (convert_nonlocal_omp_clauses): Likewise.
(convert_local_omp_clauses): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.cc: Likewise.
gcc/c-family/ChangeLog:
* c-omp.cc (c_oacc_split_loop_clauses): Adjust to renamings.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_collapse): Adjust to renamings.
(c_parser_oacc_clause_tile): Likewise.
(c_parser_omp_for_loop): Likewise.
* c-typeck.cc (c_finish_omp_clauses): Likewise.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_oacc_clause_tile): Adjust to renamings.
(cp_parser_omp_clause_collapse): Likewise.
(cp_parser_omp_for_loop): Likewise.
* pt.cc (tsubst_omp_clauses): Likewise.
* semantics.cc (finish_omp_clauses): Likewise.
(finish_omp_for): Likewise.
gcc/fortran/ChangeLog:
* openmp.cc (enum omp_mask2): Adjust to renamings.
(gfc_match_omp_clauses): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
|
|
This commit implements the OpenMP 5.1 "omp unroll" directive for
Fortran. The Fortran front end changes encompass the parsing and the
verification of nesting restrictions etc. The actual loop
transformation is implemented in a new language-independent
"omp_transform_loops" pass which runs before omp lowering. No attempt
is made to re-use existing unrolling optimizations because a separate
implementation allows for better control of the unrolling. The new
pass will also serve as a foundation for the implementation of further
OpenMP loop transformations. This commit only implements the support
for "omp unroll" on the outermost loop of a loop nest. The support
for inner loops will be added later.
gcc/ChangeLog:
* Makefile.in: Add omp_transform_loops.o.
* gimple-pretty-print.cc (dump_gimple_omp_for): Handle "full"
and "partial" clauses.
* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_TRANSFORM_LOOP.
* gimplify.cc (is_gimple_stmt): Handle OMP_UNROLL.
(gimplify_scan_omp_clauses): Handle OMP_UNROLL_FULL,
OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
(gimplify_adjust_omp_clauses): Handle OMP_UNROLL_FULL,
OMP_UNROLL_NONE, and OMP_UNROLL_PARTIAL.
(gimplify_omp_for): Handle OMP_UNROLL.
(gimplify_expr): Likewise.
* params.opt: Add omp-unroll-full-max-iteration and
omp-unroll-default-factor.
* passes.def: Add pass_omp_transform_loop before
pass_lower_omp.
* tree-core.h (enum omp_clause_code): Add
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_PARTIAL.
* tree-pass.h (make_pass_omp_transform_loops): Declare
pmake_pass_omp_transform_loops.
* tree-pretty-print.cc (dump_omp_clause): Handle
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_FULL, and
OMP_CLAUSE_UNROLL_PARTIAL.
(dump_generic_node): Handle OMP_UNROLL.
* tree.cc (omp_clause_num_ops): Add number of operators
for OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
OMP_CLAUSE_UNROLL_PARTIAl.
(omp_clause_code_names): Add name strings for
OMP_CLAUSE_UNROLL_FULL, OMP_CLAUSE_UNROLL_NONE, and
OMP_CLAUSE_UNROLL_PARTIAL.
* tree.def (OMP_UNROLL): Define.
* tree.h (OMP_CLAUSE_UNROLL_PARTIAL_EXPR): Define.
* omp-transform-loops.cc: New file.
* omp-general.cc (omp_loop_transform_clause_p): New function.
* omp-general.h (omp_loop_transform_clause_p): New declaration.
gcc/fortran/ChangeLog:
* dump-parse-tree.cc (show_omp_clauses): Handle "unroll full"
and "unroll partial".
(show_omp_node): Handle OMP_UNROLL.
(show_code_node): Handle EXEC_OMP_UNROLL.
* gfortran.h (enum gfc_statement): Add ST_OMP_UNROLL, ST_OMP_END_UNROLL.
(enum gfc_exec_op): Add EXEC_OMP_UNROLL.
* match.h (gfc_match_omp_unroll): Declare.
* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_NONE, OMP_CLAUSE_UNROLL_PARTIAL.
(gfc_match_omp_clauses): Handle "omp unroll partial".
(OMP_UNROLL_CLAUSES): New macro definition.
(gfc_match_omp_unroll): Match "full" clause.
(omp_unroll_removes_loop_nest): New function.
(resolve_omp_unroll): New function.
(resolve_omp_do): Accept and verify "omp unroll"
directives between directive and loop.
(omp_code_to_statement): Handle EXEC_OMP_UNROLL.
(gfc_resolve_omp_directive): Likewise.
* parse.cc (decode_omp_directive): Handle "undroll" and "end unroll".
(next_statement): Handle ST_OMP_UNROLL.
(gfc_ascii_statement): Handle ST_OMP_UNROLL and ST_OMP_END_UNROLL.
(parse_omp_do): Accept ST_OMP_UNROLL and ST_OMP_END_UNROLL
before/after loop.
(parse_executable): Handle ST_OMP_UNROLL.
* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_UNROLL.
(gfc_resolve_code): Likewise.
* st.cc (gfc_free_statement): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle unroll clauses.
(gfc_trans_omp_do): Handle OMP_CLAUSE_UNROLL_FULL,
OMP_CLAUSE_UNROLL_PARTIAL, OMP_CLAUSE_UNROLL_NONE creation.
(gfc_trans_omp_directive): Handle EXEC_OMP_UNROLL.
* trans.cc (trans_code): Likewise.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-2.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-3.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-4.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-5.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7a.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7b.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-7c.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-8.f90: New test.
* testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/loop-transforms/unroll-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-4.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-5.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-6.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-7.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-9.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-1.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-2.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-no-clause-3.f90: New test.
* gfortran.dg/gomp/loop-transforms/unroll-simd-1.f90: New test.
|
|
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct. It is also supported in defaultmap.
The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device. It can also be combined
with 'always' in map clauses.
2023-02-01 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_omp_variable_list): Set default motion
modifier.
(c_parser_omp_var_list_parens): Add new parameter with default. Parse
'present' motion modifier and apply.
(c_parser_omp_clause_defaultmap): Parse 'present' in defaultmap.
(c_parser_omp_clause_map): Parse 'present' modifier in map clauses.
(c_parser_omp_clause_to): Allow use of 'present' in variable list.
(c_parser_omp_clause_from): Likewise.
(c_parser_omp_target_data): Allow map clauses with 'present'
modifiers.
(c_parser_omp_target_enter_data): Likewise.
(c_parser_omp_target_exit_data): Likewise.
(c_parser_omp_target): Likewise.
gcc/cp/
* parser.cc (cp_parser_omp_var_list_no_open): Add new parameter with
default. Parse 'present' motion modifier and apply.
(cp_parser_omp_clause_defaultmap): Parse 'present' in defaultmap.
(cp_parser_omp_clause_map): Parse 'present' modifier in map clauses.
(cp_parser_omp_all_clauses): Allow use of 'present' in 'to' and 'from'
clauses.
(cp_parser_omp_target_data): Allow map clauses with 'present'
modifiers.
(cp_parser_omp_target_enter_data): Likewise.
(cp_parser_omp_target_exit_data): Likewise.
* semantics.cc (finish_omp_target): Accept map clauses with 'present'
modifiers.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
modifier.
(show_omp_clauses): Display 'present' motion modifier for 'to'
and 'from' clauses.
* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
modifiers.
(enum gfc_omp_motion_modifier): New.
(struct gfc_omp_namelist): Add motion_modifier field.
* openmp.cc (gfc_match_omp_variable_list): Add new parameter with
default. Parse 'present' motion modifier and apply.
(gfc_match_omp_clauses): Parse 'present' in defaultmap, 'from'
clauses, 'map' clauses and 'to' clauses.
(resolve_omp_clauses): Allow 'present' modifiers on 'target',
'target data', 'target enter' and 'target exit' directives.
* trans-openmp.cc (gfc_omp_deep_map_kind_p): Handle map kinds with
'present' modifier.
(gfc_trans_omp_clauses): Apply 'present' modifiers to tree node for
'map', 'to' and 'from' clauses. Apply 'present' for defaultmap.
gcc/
* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
set.
(omp_get_attachment): Handle map clauses with 'present' modifier.
(omp_group_base): Likewise.
(gimplify_scan_omp_clauses): Reorder present maps to come first.
Set GOVD flags for present defaultmaps.
(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
clauses.
(lower_omp_target): Handle map clauses with 'present' modifier.
Handle 'to' and 'from' clauses with 'present'.
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
(enum omp_clause_motion_modifier): New.
(struct tree_omp_clause): Add motion_modifier field.
* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
'from' clauses with 'present' modifier. Handle present defaultmap.
* tree.h (OMP_CLAUSE_MOTION_MODIFIER): New.
(OMP_CLAUSE_SET_MOTION_MODIFIER): New.
gcc/testsuite/
* c-c++-common/gomp/defaultmap-4.c: New.
* c-c++-common/gomp/map-6.c: Update expected error messages.
* c-c++-common/gomp/map-9.c: New.
* c-c++-common/gomp/target-update-1.c: New.
* gfortran.dg/gomp/defaultmap-1.f90: Update expected error messages.
* gfortran.dg/gomp/defaultmap-8.f90: New.
* gfortran.dg/gomp/map-10.f90: New.
* gfortran.dg/gomp/target-update-1.f90: New.
include/
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
(GOMP_MAP_FLAG_FORCE): Redefine.
(GOMP_MAP_FLAG_PRESENT): New.
(GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
(enum gomp_map_kind): Add map kinds with 'present' modifiers.
(GOMP_MAP_COPY_TO_P): Evaluate to true for map variants with 'present'
modifiers.
(GOMP_MAP_COPY_FROM_P): Likewise.
(GOMP_MAP_ALWAYS_TO_P): Evaluate to true for map variants with
'always, present' modifiers.
(GOMP_MAP_ALWAYS_FROM_P): Likewise.
(GOMP_MAP_ALWAYS): Redefine.
(GOMP_MAP_FORCE_P): New.
(GOMP_MAP_PRESENT_P): New.
libgomp/
* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
modifier.
(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
(gomp_map_vars_internal): Emit runtime error if memory region not
present.
(gomp_update): Likewise.
(gomp_target_rev): Likewise.
* testsuite/libgomp.c-c++-common/target-present-1.c: New.
* testsuite/libgomp.c-c++-common/target-present-2.c: New.
* testsuite/libgomp.c-c++-common/target-present-3.c: New.
* testsuite/libgomp.fortran/target-present-1.f90: New.
* testsuite/libgomp.fortran/target-present-2.f90: New.
* testsuite/libgomp.fortran/target-present-3.f90: New.
Add 'present' map types to gfc_omp_deep_map_kind_p
|
|
This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596412.html
For user defined allocator handles, this allows target regions to assign
memory space and traits to allocators, and automatically calls
omp_init/destroy_allocator() in the beginning/end of the target region.
For pre-defined allocators (i.e. omp_..._mem_alloc names), this is a no-op,
such clauses are not created.
Asides from the front-end portions, the target region transforms are
done in gimplify_omp_workshare.
This patch also includes added changes to enforce the "allocate allocator
must be also in a uses_allocator clause". This is done during
gimplify_scan_omp_clauses.
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_split_clauses): Add OMP_CLAUSE_USES_ALLOCATORS case.
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_USES_ALLOCATORS.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_name): Add case for uses_allocators
clause.
(c_parser_omp_clause_uses_allocators): New function.
(c_parser_omp_all_clauses): Add PRAGMA_OMP_CLAUSE_USES_ALLOCATORS case.
(OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_USES_ALLOCATORS to mask.
* c-typeck.cc (c_finish_omp_clauses): Add case handling for
OMP_CLAUSE_USES_ALLOCATORS.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_name): Add case for uses_allocators
clause.
(cp_parser_omp_clause_uses_allocators): New function.
(cp_parser_omp_all_clauses): Add PRAGMA_OMP_CLAUSE_USES_ALLOCATORS case.
(OMP_TARGET_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_USES_ALLOCATORS to mask.
* semantics.cc (finish_omp_clauses): Add case handling for
OMP_CLAUSE_USES_ALLOCATORS.
fortran/ChangeLog:
* gfortran.h (struct gfc_omp_namelist): Add memspace_sym, traits_sym
fields.
(OMP_LIST_USES_ALLOCATORS): New list enum.
* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_USES_ALLOCATORS.
(gfc_match_omp_clause_uses_allocators): New function.
(gfc_match_omp_clauses): Add case to handle OMP_CLAUSE_USES_ALLOCATORS.
(OMP_TARGET_CLAUSES): Add OMP_CLAUSE_USES_ALLOCATORS.
(resolve_omp_clauses): Add "USES_ALLOCATORS" to clause_names[].
* dump-parse-tree.cc (show_omp_namelist): Handle OMP_LIST_USES_ALLOCATORS.
(show_omp_clauses): Likewise.
* trans-array.cc (gfc_conv_array_initializer): Adjust array index
to always be a created tree expression instead of NULL_TREE when zero.
* trans-openmp.cc (gfc_trans_omp_clauses): For ALLOCATE clause, handle
using gfc_trans_omp_variable for EXPR_VARIABLE exprs.
Add handling of OMP_LIST_USES_ALLOCATORS case.
* types.def (BT_FN_VOID_PTRMODE): Define.
(BT_FN_PTRMODE_PTRMODE_INT_PTR): Define.
gcc/ChangeLog:
* builtin-types.def (BT_FN_VOID_PTRMODE): Define.
(BT_FN_PTRMODE_PTRMODE_INT_PTR): Define.
* omp-builtins.def (BUILT_IN_OMP_INIT_ALLOCATOR): Define.
(BUILT_IN_OMP_DESTROY_ALLOCATOR): Define.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_USES_ALLOCATORS.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_USES_ALLOCATORS.
* tree.h (OMP_CLAUSE_USES_ALLOCATORS_ALLOCATOR): New macro.
(OMP_CLAUSE_USES_ALLOCATORS_MEMSPACE): New macro.
(OMP_CLAUSE_USES_ALLOCATORS_TRAITS): New macro.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE_USES_ALLOCATORS.
(omp_clause_code_name): Add "uses_allocators".
(walk_tree_1): Add OMP_CLAUSE_USES_ALLOCATORS case.
* gimplify.cc (gimplify_scan_omp_clauses): Add checking of OpenMP target
region allocate clauses, to require a uses_allocators clause to exist
for allocators.
(gimplify_omp_workshare): Add handling of OMP_CLAUSE_USES_ALLOCATORS
for OpenMP target regions; create calls of omp_init/destroy_allocator
around target region body.
* omp-low.cc (lower_private_allocate): Adjust receiving of allocator.
(lower_rec_input_clauses): Likewise.
(create_task_copyfn): Add dereference for allocator if needed.
* system.h (startswith): New function.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/uses_allocators-1.c: New test.
* c-c++-common/gomp/uses_allocators-2.c: New test.
* c-c++-common/gomp/uses_allocators-3.c: New test.
* gfortran.dg/gomp/allocate-1.f90: Adjust testcase.
* gfortran.dg/gomp/uses_allocators-1.f90: New test.
* gfortran.dg/gomp/uses_allocators-2.f90: New test.
* gfortran.dg/gomp/uses_allocators-3.f90: New test.
|
|
Currently we are only handling omp allocate directive that is associated
with an allocate statement. This statement results in malloc and free calls.
The malloc calls are easy to get to as they are in the same block as allocate
directive. But the free calls come in a separate cleanup block. To help any
later passes finding them, an allocate directive is generated in the
cleanup block with kind=free. The normal allocate directive is given
kind=allocate.
gcc/fortran/ChangeLog:
* gfortran.h (struct access_ref): Declare new members
omp_allocated and omp_allocated_end.
* openmp.cc (gfc_match_omp_allocate): Set new_st.resolved_sym to
NULL.
(prepare_omp_allocated_var_list_for_cleanup): New function.
(gfc_resolve_omp_allocate): Call it.
* trans-decl.cc (gfc_trans_deferred_vars): Process omp_allocated.
* trans-openmp.cc (gfc_trans_omp_allocate): Set kind for the stmt
generated for allocate directive.
gcc/ChangeLog:
* tree-core.h (struct tree_base): Add comments.
* tree-pretty-print.cc (dump_generic_node): Handle allocate directive
kind.
* tree.h (OMP_ALLOCATE_KIND_ALLOCATE): New define.
(OMP_ALLOCATE_KIND_FREE): Likewise.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-6.f90: Test kind of allocate directive.
|
|
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_clauses): Handle OMP_LIST_ALLOCATOR.
(gfc_trans_omp_allocate): New function.
(gfc_trans_omp_directive): Handle EXEC_OMP_ALLOCATE.
gcc/ChangeLog:
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_ALLOCATOR.
(dump_generic_node): Handle OMP_ALLOCATE.
* tree.def (OMP_ALLOCATE): New.
* tree.h (OMP_ALLOCATE_CLAUSES): Likewise.
(OMP_ALLOCATE_DECL): Likewise.
(OMP_ALLOCATE_ALLOCATOR): Likewise.
* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_ALLOCATOR.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-6.f90: New test.
|
|
This patch implements parsing for the OpenMP metadirective introduced in
OpenMP 5.0. Metadirectives are parsed into an OMP_METADIRECTIVE node,
with the variant clauses forming a chain accessible via
OMP_METADIRECTIVE_CLAUSES. Each clause contains the context selector
and tree for the variant.
User conditions in the selector are now permitted to be non-constant when
used in metadirectives as specified in OpenMP 5.1.
2021-01-25 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* omp-general.cc (omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New stub function.
* omp-general.h (struct omp_metadirective_variant): New.
(omp_context_selector_matches): Add extra argument.
(omp_resolve_metadirective): New prototype.
* tree.def (OMP_METADIRECTIVE): New.
* tree.h (OMP_METADIRECTIVE_CLAUSES): New macro.
gcc/c/
* c-parser.cc (c_parser_skip_to_end_of_block_or_statement): Handle
parentheses in statement.
(c_parser_omp_metadirective): New prototype.
(c_parser_omp_context_selector): Add extra argument. Allow
non-constant expressions.
(c_parser_omp_context_selector_specification): Add extra argument and
propagate it to c_parser_omp_context_selector.
(analyze_metadirective_body): New.
(c_parser_omp_metadirective): New.
(c_parser_omp_construct): Handle PRAGMA_OMP_METADIRECTIVE.
gcc/c-family/
* c-common.h (enum c_omp_directive_kind): Add C_OMP_DIR_META.
(c_omp_expand_metadirective): New prototype.
* c-gimplify.cc (genericize_omp_metadirective_stmt): New.
(c_genericize_control_stmt): Handle OMP_METADIRECTIVE tree nodes.
* c-omp.cc (omp_directives): Classify metadirectives as C_OMP_DIR_META.
(c_omp_expand_metadirective): New stub function.
* c-pragma.cc (omp_pragmas): Add entry for metadirective.
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_METADIRECTIVE.
|
|
2020-08-19 Sandra Loosemore <sandra@codesourcery.com>
gcc/
* tree.h (OACC_LOOP_COMBINED): New.
gcc/c/
* c-parser.cc (c_parser_oacc_loop): Set OACC_LOOP_COMBINED.
gcc/cp/
* parser.cc (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_do): Add combined parameter,
use it to set OACC_LOOP_COMBINED. Update all call sites.
|
|
gcc/
* gimplify.cc (privatize_reduction): New struct.
(localize_reductions_r, localize_reductions): New functions.
(gimplify_omp_for): Call localize_reductions.
(gimplify_omp_workshare): Likewise.
* omp-low.cc (lower_oacc_reductions): Handle localized reductions.
Create fewer temp vars.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_REDUCTION_PRIVATE_DECL
documentation.
* tree.cc (omp_clause_num_ops): Bump number of ops for
OMP_CLAUSE_REDUCTION to 6.
(walk_tree_1): Adjust accordingly.
* tree.h (OMP_CLAUSE_REDUCTION_PRIVATE_DECL): Add macro.
|
|
[PR109215]
Our documentation sadly talks about elt_type arr[0]; as zero-length arrays,
not arrays with zero elements. Unfortunately, those aren't the only arrays
which can have zero size, the same size can be also result of zero-length
element, like in GNU C struct whatever {} or in GNU C/C++ if the element
type is [0] array or combination thereof (dunno if Ada doesn't allow
something similar too). One can't do much with them, taking address of
their elements, (no-op) copying of the elements in and out. But they
behave differently from arr[0] arrays e.g. in that using non-zero indexes
in them (as long as they are within bounds as for normal arrays) is valid.
I think this naming inaccuracy resulted in Martin designing
special_array_member in an inconsistent way, mixing size zero array members
with array members of one or two or more elements and then using the
size zero interchangeably with zero elements.
The following patch changes that (but doesn't do any
documentation/diagnostics renaming, as this is really a corner case),
such that int_0/trail_0 for consistency is just about [0] arrays
plus [] for the latter, not one or more zero sized elements case.
The testcase has one xfailed case for where perhaps in later GCC versions
we could add extra code to handle it, for some reason we don't diagnose
out of bounds accesses for the zero sized elements cases. It will be
harder because e.g. FRE will canonicalize &var.fld[0] and &var.fld[10]
to just one of them because they are provably the same address.
But the important thing is to fix this regression (where we warn on
completely valid code in the Linux kernel). Anyway, for further work
on this we don't really need any extra help from special_array_member,
all code can just check integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (type))),
it doesn't depend on the position of the members etc.
2023-03-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109215
* tree.h (enum special_array_member): Adjust comments for int_0
and trail_0.
* tree.cc (component_ref_sam_type): Clear zero_elts if memtype
has zero sized element type and the array has variable number of
elements or constant one or more elements.
(component_ref_size): Adjust comments, formatting fix.
* gcc.dg/Wzero-length-array-bounds-3.c: New test.
|
|
The recent change to undo the tree_code_type/tree_code_length
excessive duplication apparently broke building the Linux kernel
plugin. While it is certainly desirable that GCC plugins are built
with the same compiler as GCC has been built and with the same options
(at least the important ones), it might be hard to arrange that,
e.g. if gcc is built using a cross-compiler but the plugin then built
natively, or GCC isn't bootstrapped for other reasons, or just as in
the kernel case they were building the plugin with -std=gnu++11 while
the bootstrapped GCC has been built without any such option and so with
whatever the compiler defaulted to.
For C++17 and later tree_code_{type,length} are UNIQUE symbols with
those assembler names, while for C++11/14 they were
_ZL14tree_code_type and _ZL16tree_code_length.
The following patch uses a comdat var for those even for C++11/14
as suggested by Maciej Cencora. Relying on weak attribute is not an
option because not all hosts support it and there are non-GNU system
compilers. While we could use it unconditionally,
I think defining a template just to make it comdat is weird, and
the compiler itself is always built with the same compiler.
Plugins, being separate shared libraries, will have a separate copy of
the arrays if they are ODR-used in the plugin, so there is not a big
deal if e.g. cc1plus uses tree_code_type while plugin uses
_ZN19tree_code_type_tmplILi0EE14tree_code_typeE or vice versa.
2023-03-10 Jakub Jelinek <jakub@redhat.com>
PR plugins/108634
* tree-core.h (tree_code_type, tree_code_length): For C++11 or
C++14, don't declare as extern const arrays.
(tree_code_type_tmpl, tree_code_length_tmpl): New types with
static constexpr member arrays for C++11 or C++14.
* tree.h (TREE_CODE_CLASS): For C++11 or C++14 use
tree_code_type_tmpl <0>::tree_code_type instead of tree_code_type.
(TREE_CODE_LENGTH): For C++11 or C++14 use
tree_code_length_tmpl <0>::tree_code_length instead of
tree_code_length.
* tree.cc (tree_code_type, tree_code_length): Remove.
|
|
Many functions defined in our headers are declared 'static inline' which
is a C idiom whose use predates our move to C++ as the implementation
language. But in C++ the inline keyword is more than just a compiler
hint, and is sufficient to give the function the intended semantics.
In fact declaring a function both static and inline is a pessimization
since static effectively disables the desired definition merging
behavior enabled by inline, and is also a source of (harmless) ODR
violations when a static inline function gets called from a non-static
inline one (such as tree_operand_check calling tree_operand_length).
This patch mechanically fixes the vast majority of occurrences of this
anti-pattern throughout the compiler's headers via the command line
sed -i 's/^static inline/inline/g' gcc/*.h gcc/*/*.h
There's also a manual change to remove the redundant declarations
of is_ivar and lookup_category in gcc/objc/objc-act.cc which would
otherwise conflict with their modified definitions in objc-act.h
(due to the difference in staticness).
Besides fixing some ODR violations, this speeds up stage1 cc1plus by
about 2% and reduces the size of its text segment by 1.5MB.
gcc/ChangeLog:
* addresses.h: Mechanically drop 'static' from 'static inline'
functions via s/^static inline/inline/g.
* asan.h: Likewise.
* attribs.h: Likewise.
* basic-block.h: Likewise.
* bitmap.h: Likewise.
* cfghooks.h: Likewise.
* cfgloop.h: Likewise.
* cgraph.h: Likewise.
* cselib.h: Likewise.
* data-streamer.h: Likewise.
* debug.h: Likewise.
* df.h: Likewise.
* diagnostic.h: Likewise.
* dominance.h: Likewise.
* dumpfile.h: Likewise.
* emit-rtl.h: Likewise.
* except.h: Likewise.
* expmed.h: Likewise.
* expr.h: Likewise.
* fixed-value.h: Likewise.
* gengtype.h: Likewise.
* gimple-expr.h: Likewise.
* gimple-iterator.h: Likewise.
* gimple-predict.h: Likewise.
* gimple-range-fold.h: Likewise.
* gimple-ssa.h: Likewise.
* gimple.h: Likewise.
* graphite.h: Likewise.
* hard-reg-set.h: Likewise.
* hash-map.h: Likewise.
* hash-set.h: Likewise.
* hash-table.h: Likewise.
* hwint.h: Likewise.
* input.h: Likewise.
* insn-addr.h: Likewise.
* internal-fn.h: Likewise.
* ipa-fnsummary.h: Likewise.
* ipa-icf-gimple.h: Likewise.
* ipa-inline.h: Likewise.
* ipa-modref.h: Likewise.
* ipa-prop.h: Likewise.
* ira-int.h: Likewise.
* ira.h: Likewise.
* lra-int.h: Likewise.
* lra.h: Likewise.
* lto-streamer.h: Likewise.
* memmodel.h: Likewise.
* omp-general.h: Likewise.
* optabs-query.h: Likewise.
* optabs.h: Likewise.
* plugin.h: Likewise.
* pretty-print.h: Likewise.
* range.h: Likewise.
* read-md.h: Likewise.
* recog.h: Likewise.
* regs.h: Likewise.
* rtl-iter.h: Likewise.
* rtl.h: Likewise.
* sbitmap.h: Likewise.
* sched-int.h: Likewise.
* sel-sched-ir.h: Likewise.
* sese.h: Likewise.
* sparseset.h: Likewise.
* ssa-iterators.h: Likewise.
* system.h: Likewise.
* target-globals.h: Likewise.
* target.h: Likewise.
* timevar.h: Likewise.
* tree-chrec.h: Likewise.
* tree-data-ref.h: Likewise.
* tree-iterator.h: Likewise.
* tree-outof-ssa.h: Likewise.
* tree-phinodes.h: Likewise.
* tree-scalar-evolution.h: Likewise.
* tree-sra.h: Likewise.
* tree-ssa-alias.h: Likewise.
* tree-ssa-live.h: Likewise.
* tree-ssa-loop-manip.h: Likewise.
* tree-ssa-loop.h: Likewise.
* tree-ssa-operands.h: Likewise.
* tree-ssa-propagate.h: Likewise.
* tree-ssa-sccvn.h: Likewise.
* tree-ssa.h: Likewise.
* tree-ssanames.h: Likewise.
* tree-streamer.h: Likewise.
* tree-switch-conversion.h: Likewise.
* tree-vectorizer.h: Likewise.
* tree.h: Likewise.
* wide-int.h: Likewise.
gcc/c-family/ChangeLog:
* c-common.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
gcc/c/ChangeLog:
* c-parser.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
gcc/cp/ChangeLog:
* cp-tree.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
gcc/fortran/ChangeLog:
* gfortran.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
gcc/jit/ChangeLog:
* jit-dejagnu.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
* jit-recording.h: Likewise.
gcc/objc/ChangeLog:
* objc-act.h: Mechanically drop static from static inline
functions via s/^static inline/inline/g.
* objc-map.h: Likewise.
* objc-act.cc: Remove the redundant redeclarations of is_ivar
and lookup_category.
|
|
This patch is an optimisation, but it's also a prerequisite for
fixing PR96373 without regressing vect-xorsign_exec.c.
Currently the vectoriser vectorises:
for (i = 0; i < N; i++)
r[i] = a[i] * __builtin_copysignf (1.0f, b[i]);
as two unconditional operations (copysign and mult).
tree-ssa-math-opts.cc later combines them into an "xorsign" function.
This works for both Advanced SIMD and SVE.
However, with the fix for PR96373, the vectoriser will instead
generate a conditional multiplication (IFN_COND_MUL). Something then
needs to fold copysign & IFN_COND_MUL to the equivalent of a conditional
xorsign. Three obvious options were:
(1) Extend tree-ssa-math-opts.cc.
(2) Do the fold in match.pd.
(3) Leave it to rtl combine.
I'm against (3), because this isn't a target-specific optimisation.
(1) would be possible, but would involve open-coding a lot of what
match.pd does for us. And, in contrast to doing the current
tree-ssa-math-opts.cc optimisation in match.pd, there should be
no danger of (2) happening too early. If we have an IFN_COND_MUL
then we're already past the stage of simplifying the original
source code.
There was also a choice between adding a conditional xorsign ifn
and simply open-coding the xorsign. The latter seems simpler,
and means less boiler-plate for target-specific code.
The signed_or_unsigned_type_for change is needed to make sure
that we stay in "SVE space" when doing the optimisation on 128-bit
fixed-length SVE.
gcc/
PR tree-optimization/96373
* tree.h (sign_mask_for): Declare.
* tree.cc (sign_mask_for): New function.
(signed_or_unsigned_type_for): For vector types, try to use the
related_int_vector_mode.
* genmatch.cc (commutative_op): Handle conditional internal functions.
* match.pd: Fold an IFN_COND_MUL+copysign into an IFN_COND_XOR+and.
gcc/testsuite/
PR tree-optimization/96373
* gcc.target/aarch64/sve/cond_xorsign_1.c: New test.
* gcc.target/aarch64/sve/cond_xorsign_2.c: Likewise.
|
|
|
|
As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find
if a floating point constant is zero and it shouldn't try to infer
equivalences from comparison against it if signed zeros are honored.
This doesn't work at all for decimal types, because real_zerop always
returns false for them (one can have different representations of decimal
zero beyond -0/+0), and it doesn't work for vector compares either,
as real_zerop checks if all elements are zero, while we need to avoid
infering equivalences from comparison against vector constants which have
at least one zero element in it (if signed zeros are honored).
Furthermore, as mentioned by Joseph, for decimal types many other values
aren't singleton.
So, this patch stops infering anything if element mode is decimal, and
otherwise uses instead of real_zerop a new function, real_maybe_zerop,
which will work even for decimal types and for complex or vector will
return true if any element is or might be zero (so it returns true
for anything but constants for now).
2022-12-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/108068
* tree.h (real_maybe_zerop): Declare.
* tree.cc (real_maybe_zerop): Define.
* tree-ssa-dom.cc (record_edge_info): Use it instead of
real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop. Always set
can_infer_simple_equiv to false for decimal floating point types.
* gcc.dg/dfp/pr108068.c: New test.
|
|
If the DECL_VALUE_EXPR of a VAR_DECL has EXPR_LOCATION set, then any use of
that variable looks like it has that location, which leads to the debugger
jumping back and forth for both lambdas and structured bindings.
Rather than fix all the uses, it seems simplest to remove any EXPR_LOCATION
when setting DECL_VALUE_EXPR. So the cp/ hunks aren't necessary, but they
avoid the need to unshare to remove the location.
PR c++/84471
PR c++/107504
gcc/cp/ChangeLog:
* coroutines.cc (transform_local_var_uses): Don't
specify a location for DECL_VALUE_EXPR.
* decl.cc (cp_finish_decomp): Likewise.
gcc/ChangeLog:
* fold-const.cc (protected_set_expr_location_unshare): Not static.
* tree.h: Declare it.
* tree.cc (decl_value_expr_insert): Use it.
include/ChangeLog:
* ansidecl.h (ATTRIBUTE_WARN_UNUSED_RESULT): Add __.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/value-expr1.C: New test.
* g++.dg/tree-ssa/value-expr2.C: New test.
* g++.dg/analyzer/pr93212.C: Move warning.
|
|
operations
This adds a new test-and-branch optab that can be used to do a conditional test
of a bit and branch. This is similar to the cbranch optab but instead can
test any arbitrary bit inside the register.
This patch recognizes boolean comparisons and single bit mask tests.
gcc/ChangeLog:
* dojump.cc (do_jump): Pass along value.
(do_jump_by_parts_greater_rtx): Likewise.
(do_jump_by_parts_zero_rtx): Likewise.
(do_jump_by_parts_equality_rtx): Likewise.
(do_compare_rtx_and_jump): Likewise.
(do_compare_and_jump): Likewise.
* dojump.h (do_compare_rtx_and_jump): New.
* optabs.cc (emit_cmp_and_jump_insn_1): Refactor to take optab to check.
(validate_test_and_branch): New.
(emit_cmp_and_jump_insns): Optiobally take a value, and when value is
supplied then check if it's suitable for tbranch.
* optabs.def (tbranch_eq$a4, tbranch_ne$a4): New.
* doc/md.texi (tbranch_@var{op}@var{mode}4): Document it.
* optabs.h (emit_cmp_and_jump_insns): New.
* tree.h (tree_zero_one_valued_p): New.
|
|
The following avoids CSE of &ps->wp to &ps->wp.hwnd confusing
-Wstringopt-overflow by making sure to produce addresses to the
biggest container from vectorization. For this I introduce
strip_zero_offset_components which turns &ps->wp.hwnd into
&(*ps) and use that to base the vector data references on.
That will also work for addresses with variable components,
alternatively emitting pointer arithmetic via calling
get_inner_reference and gimplifying that would be possible
but likely more intrusive.
This is by no means a complete fix for all of those issues
(avoiding ADDR_EXPRs in favor of pointer arithmetic might be).
Other passes will have similar issues.
In theory that might now cause false negatives.
PR tree-optimization/106904
* tree.h (strip_zero_offset_components): Declare.
* tree.cc (strip_zero_offset_components): Define.
* tree-vect-data-refs.cc (vect_create_addr_base_for_vector_ref):
Strip zero offset components before building the address.
* gcc.dg/Wstringop-overflow-pr106904.c: New testcase.
|
|
A. add the following to clarify the relationship between -Warray-bounds
and the LEVEL of -fstrict-flex-array:
By default, the trailing array of a structure will be treated as a
flexible array member by '-Warray-bounds' or '-Warray-bounds=N' if
it is declared as either a flexible array member per C99 standard
onwards ('[]'), a GCC zero-length array extension ('[0]'), or an
one-element array ('[1]'). As a result, out of bounds subscripts
or offsets into zero-length arrays or one-element arrays are not
warned by default.
You can add the option '-fstrict-flex-arrays' or
'-fstrict-flex-arrays=LEVEL' to control how this option treat
trailing array of a structure as a flexible array member.
when LEVEL<=1, no change to the default behavior.
when LEVEL=2, additional warnings will be issued for out of bounds
subscripts or offsets into one-element arrays;
when LEVEL=3, in addition to LEVEL=2, additional warnings will be
issued for out of bounds subscripts or offsets into zero-length
arrays.
B. change -Warray-bounds=2 to exclude its control on how to treat
trailing arrays as flexible array members:
'-Warray-bounds=2'
This warning level also warns about the intermediate results
of pointer arithmetic that may yield out of bounds values.
This warning level may give a larger number of false positives
and is deactivated by default.
gcc/ChangeLog:
* attribs.cc (strict_flex_array_level_of): New function.
* attribs.h (strict_flex_array_level_of): Prototype for new function.
* doc/invoke.texi: Update -Warray-bounds by specifying the impact from
-fstrict-flex-arrays. Also update -Warray-bounds=2 by eliminating its
impact on treating trailing arrays as flexible array members.
* gimple-array-bounds.cc (get_up_bounds_for_array_ref): New function.
(check_out_of_bounds_and_warn): New function.
(array_bounds_checker::check_array_ref): Update with call to the above
new functions.
* tree.cc (array_ref_flexible_size_p): Add one new argument.
(component_ref_sam_type): New function.
(component_ref_size): Control with level of strict-flex-array.
* tree.h (array_ref_flexible_size_p): Update prototype.
(enum struct special_array_member): Add two new enum values.
(component_ref_sam_type): New prototype.
gcc/c/ChangeLog:
* c-decl.cc (is_flexible_array_member_p): Call new function
strict_flex_array_level_of.
gcc/testsuite/ChangeLog:
* gcc.dg/Warray-bounds-11.c: Update warnings for -Warray-bounds=2.
* gcc.dg/Warray-bounds-flex-arrays-1.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-2.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-3.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-4.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-5.c: New test.
* gcc.dg/Warray-bounds-flex-arrays-6.c: New test.
|
|
This removes all uses of ASSERT_EXPR except the internal one in ipa-*.
gcc/ChangeLog:
* doc/gimple.texi: Remove ASSERT_EXPR references.
* fold-const.cc (tree_expr_nonzero_warnv_p): Same.
(fold_binary_loc): Same.
(tree_expr_nonnegative_warnv_p): Same.
* gimple-array-bounds.cc (get_base_decl): Same.
* gimple-pretty-print.cc (dump_unary_rhs): Same.
* gimple.cc (get_gimple_rhs_num_ops): Same.
* pointer-query.cc (handle_ssa_name): Same.
* tree-cfg.cc (verify_gimple_assign_single): Same.
* tree-pretty-print.cc (dump_generic_node): Same.
* tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr):Same.
(interpret_rhs_expr): Same.
* tree-ssa-operands.cc (operands_scanner::get_expr_operands): Same.
* tree-ssa-propagate.cc
(substitute_and_fold_dom_walker::before_dom_children): Same.
* tree-ssa-threadedge.cc: Same.
* tree-vrp.cc (overflow_comparison_p): Same.
* tree.def (ASSERT_EXPR): Add note.
* tree.h (ASSERT_EXPR_VAR): Remove.
(ASSERT_EXPR_COND): Remove.
* vr-values.cc (simplify_using_ranges::vrp_visit_cond_stmt):
Remove comment.
|
|
The name of the utility routine "array_at_struct_end_p" is misleading
and should be changed to a new name that more accurately reflects its
real meaning.
The routine "array_at_struct_end_p" is used to check whether an array
reference is to an array whose actual size might be larger than its
upper bound implies, which includes 3 different cases:
A. a ref to a flexible array member at the end of a structure;
B. a ref to an array with a different type against the original decl;
C. a ref to an array that was passed as a parameter;
The old name only reflects the above case A, therefore very confusing
when reading the corresponding gcc source code.
In this patch, A new name "array_ref_flexible_size_p" is used to replace
the old name.
All the references to the routine "array_at_struct_end_p" was replaced
with this new name, and the corresponding comments were updated to make
them clean and consistent.
gcc/ChangeLog:
* gimple-array-bounds.cc (trailing_array): Replace
array_at_struct_end_p with new name and update comments.
* gimple-fold.cc (get_range_strlen_tree): Likewise.
* gimple-ssa-warn-restrict.cc (builtin_memref::builtin_memref):
Likewise.
* graphite-sese-to-poly.cc (bounds_are_valid): Likewise.
* tree-if-conv.cc (idx_within_array_bound): Likewise.
* tree-object-size.cc (addr_object_size): Likewise.
* tree-ssa-alias.cc (component_ref_to_zero_sized_trailing_array_p):
Likewise.
(stmt_kills_ref_p): Likewise.
* tree-ssa-loop-niter.cc (idx_infer_loop_bounds): Likewise.
* tree-ssa-strlen.cc (maybe_set_strlen_range): Likewise.
* tree.cc (array_at_struct_end_p): Rename to ...
(array_ref_flexible_size_p): ... this.
(component_ref_size): Replace array_at_struct_end_p with new name.
* tree.h (array_at_struct_end_p): Rename to ...
(array_ref_flexible_size_p): ... this.
|
|
C2x allows function prototypes to be given as (...), a prototype
meaning a variable-argument function with no named arguments. To
allow such functions to access their arguments, requirements for
va_start calls are relaxed so it ignores all but its first argument
(i.e. subsequent arguments, if any, can be arbitrary pp-token
sequences).
Implement this feature accordingly. The va_start relaxation in
<stdarg.h> is itself easy: __builtin_va_start already supports a
second argument of 0 instead of a parameter name, and calls get
converted internally to the form using 0 for that argument, so
<stdarg.h> just needs changing to use a variadic macro that passes 0
as the second argument of __builtin_va_start. (This is done only in
C2x mode, on the expectation that users of older standard would expect
unsupported uses of va_start to be diagnosed.)
For the (...) functions, it's necessary to distinguish these from
unprototyped functions, whereas previously C++ (...) functions and
unprototyped functions both used NULL TYPE_ARG_TYPES. A flag is added
to tree_type_common to mark the (...) functions; as discussed on gcc@,
doing things this way is likely to be safer for unchanged code in GCC
than adding a different form of representation in TYPE_ARG_TYPES, or
adding a flag that instead signals that the function is unprototyped.
There was previously an option
-fallow-parameterless-variadic-functions to enable support for (...)
prototypes. The support was incomplete - it treated the functions as
unprototyped, and only parsed some declarations, not e.g.
"int g (int (...));". This option is changed into a no-op ignored
option; (...) is always accepted syntactically, with a pedwarn_c11
call to given required diagnostics when appropriate. The peculiarity
of a parameter list with __attribute__ followed by '...' being
accepted with that option is removed.
Interfaces in tree.cc that create function types are adjusted to set
this flag as appropriate. It is of course possible that some existing
users of the functions to create variable-argument functions actually
wanted unprototyped functions in the no-named-argument case, rather
than functions with a (...) prototype; some such cases in c-common.cc
(for built-in functions and implicit function declarations) turn out
to need updating for that reason.
I didn't do anything to change how the C++ front end creates (...)
function types. It's very likely there are unchanged places in the
compiler that in fact turn out to need changes to work properly with
(...) function prototypes.
Target setup_incoming_varargs hooks, where they used the information
passed about the last named argument, needed updating to avoid using
that information in the (...) case. Note that apart from the x86
changes, I haven't done any testing of those target changes beyond
building cc1 to check for syntax errors. It's possible further
target-specific fixes will be needed; target maintainers should watch
out for failures of c2x-stdarg-4.c or c2x-stdarg-split-1a.c, the
execution tests, which would indicate that this feature is not working
correctly. Those tests also verify the case where there are named
arguments but the last named argument has a declaration that results
in undefined behavior in previous C standard versions, such as a type
changed by the default argument promotions.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
gcc/
* config/aarch64/aarch64.cc (aarch64_setup_incoming_varargs):
Check TYPE_NO_NAMED_ARGS_STDARG_P.
* config/alpha/alpha.cc (alpha_setup_incoming_varargs): Likewise.
* config/arc/arc.cc (arc_setup_incoming_varargs): Likewise.
* config/arm/arm.cc (arm_setup_incoming_varargs): Likewise.
* config/csky/csky.cc (csky_setup_incoming_varargs): Likewise.
* config/epiphany/epiphany.cc (epiphany_setup_incoming_varargs):
Likewise.
* config/fr30/fr30.cc (fr30_setup_incoming_varargs): Likewise.
* config/frv/frv.cc (frv_setup_incoming_varargs): Likewise.
* config/ft32/ft32.cc (ft32_setup_incoming_varargs): Likewise.
* config/i386/i386.cc (ix86_setup_incoming_varargs): Likewise.
* config/ia64/ia64.cc (ia64_setup_incoming_varargs): Likewise.
* config/loongarch/loongarch.cc
(loongarch_setup_incoming_varargs): Likewise.
* config/m32r/m32r.cc (m32r_setup_incoming_varargs): Likewise.
* config/mcore/mcore.cc (mcore_setup_incoming_varargs): Likewise.
* config/mips/mips.cc (mips_setup_incoming_varargs): Likewise.
* config/mmix/mmix.cc (mmix_setup_incoming_varargs): Likewise.
* config/nds32/nds32.cc (nds32_setup_incoming_varargs): Likewise.
* config/nios2/nios2.cc (nios2_setup_incoming_varargs): Likewise.
* config/riscv/riscv.cc (riscv_setup_incoming_varargs): Likewise.
* config/rs6000/rs6000-call.cc (setup_incoming_varargs): Likewise.
* config/sh/sh.cc (sh_setup_incoming_varargs): Likewise.
* config/visium/visium.cc (visium_setup_incoming_varargs):
Likewise.
* config/vms/vms-c.cc (vms_c_common_override_options): Do not set
flag_allow_parameterless_variadic_functions.
* doc/invoke.texi (-fallow-parameterless-variadic-functions): Do
not document option.
* function.cc (assign_parms): Call assign_parms_setup_varargs for
TYPE_NO_NAMED_ARGS_STDARG_P case.
* ginclude/stdarg.h [__STDC_VERSION__ > 201710L] (va_start): Make
variadic macro. Pass second argument of 0 to __builtin_va_start.
* target.def (setup_incoming_varargs): Update documentation.
* doc/tm.texi: Regenerate.
* tree-core.h (struct tree_type_common): Add
no_named_args_stdarg_p.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields): Unpack
TYPE_NO_NAMED_ARGS_STDARG_P.
* tree-streamer-out.cc (pack_ts_type_common_value_fields): Pack
TYPE_NO_NAMED_ARGS_STDARG_P.
* tree.cc (type_cache_hasher::equal): Compare
TYPE_NO_NAMED_ARGS_STDARG_P.
(build_function_type): Add argument no_named_args_stdarg_p.
(build_function_type_list_1, build_function_type_array_1)
(reconstruct_complex_type): Update calls to build_function_type.
(stdarg_p, prototype_p): Return true for (...) functions.
(gimple_canonical_types_compatible_p): Compare
TYPE_NO_NAMED_ARGS_STDARG_P.
* tree.h (TYPE_NO_NAMED_ARGS_STDARG_P): New.
(build_function_type): Update prototype.
gcc/c-family/
* c-common.cc (def_fn_type): Call build_function_type for
zero-argument variable-argument function.
(c_common_nodes_and_builtins): Build default_function_type with
build_function_type.
* c.opt (fallow-parameterless-variadic-functions): Mark as ignored
option.
gcc/c/
* c-decl.cc (grokdeclarator): Pass
arg_info->no_named_args_stdarg_p to build_function_type.
(grokparms): Check arg_info->no_named_args_stdarg_p before
converting () to (void).
(build_arg_info): Initialize no_named_args_stdarg_p.
(get_parm_info): Set no_named_args_stdarg_p.
(start_function): Pass TYPE_NO_NAMED_ARGS_STDARG_P to
build_function_type.
(store_parm_decls): Count (...) functions as prototyped.
* c-parser.cc (c_parser_direct_declarator): Allow '...' after open
parenthesis to start parameter list.
(c_parser_parms_list_declarator): Always allow '...' with no
arguments, call pedwarn_c11 and set no_named_args_stdarg_p.
* c-tree.h (struct c_arg_info): Add field no_named_args_stdarg_p.
* c-typeck.cc (composite_type): Handle
TYPE_NO_NAMED_ARGS_STDARG_P.
(function_types_compatible_p): Compare
TYPE_NO_NAMED_ARGS_STDARG_P.
gcc/fortran/
* trans-types.cc (gfc_get_function_type): Do not use
build_varargs_function_type_vec for unprototyped function.
gcc/lto/
* lto-common.cc (compare_tree_sccs_1): Compare
TYPE_NO_NAMED_ARGS_STDARG_P.
gcc/objc/
* objc-next-runtime-abi-01.cc (build_next_objc_exception_stuff):
Use build_function_type to build type of objc_setjmp_decl.
gcc/testsuite/
* gcc.dg/c11-stdarg-1.c, gcc.dg/c11-stdarg-2.c,
gcc.dg/c11-stdarg-3.c, gcc.dg/c2x-stdarg-1.c,
gcc.dg/c2x-stdarg-2.c, gcc.dg/c2x-stdarg-3.c,
gcc.dg/c2x-stdarg-4.c, gcc.dg/gnu2x-stdarg-1.c,
gcc.dg/torture/c2x-stdarg-split-1a.c,
gcc.dg/torture/c2x-stdarg-split-1b.c: New tests.
* gcc.dg/Wold-style-definition-2.c, gcc.dg/format/sentinel-1.c:
Update expected diagnostics.
* gcc.dg/c2x-nullptr-1.c (test5): Cast unused parameter to (void).
* gcc.dg/diagnostic-token-ranges.c: Use -pedantic. Expect warning
in place of error.
|
|
Simplify several calls to build_string_literal by not requiring redundant
strlen or IDENTIFIER_* in the caller.
I also corrected a wrong comment on IDENTIFIER_LENGTH.
gcc/ChangeLog:
* tree.h (build_string_literal): New one-argument overloads that
take tree (identifier) and const char *.
* builtins.cc (fold_builtin_FILE)
(fold_builtin_FUNCTION)
* gimplify.cc (gimple_add_init_for_auto_var)
* vtable-verify.cc (verify_bb_vtables): Simplify calls.
gcc/cp/ChangeLog:
* cp-gimplify.cc (fold_builtin_source_location)
* vtable-class-hierarchy.cc (register_all_pairs): Simplify calls to
build_string_literal.
(build_string_from_id): Remove.
|
|
Here is a complete patch to add std::bfloat16_t support on
x86 (AArch64 and ARM left for later). Almost no BFmode optabs
are added by the patch, so for binops/unops it extends to SFmode
first and then truncates back to BFmode.
For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations
of all those conversions so that we avoid double rounding, for
BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much
it emits BFmode -> SFmode conversion first and then converts to the even
wider mode, neither step should be imprecise.
For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion
and then SFmode -> HFmode, because neither format is subset or superset
of the other, while SFmode is superset of both.
expr.cc then contains a -ffast-math optimization of the BF -> SF and
SF -> BF conversions if we don't optimize for space (and for the latter
if -frounding-math isn't enabled either).
For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16
but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math
|| !flag_unsafe_math_optimizations, because I think the insn doesn't
raise on sNaNs, hardcodes round to nearest and flushes denormals to zero.
By default (unless x86 -fexcess-precision=16) we use float excess
precision for BFmode, so truncate only on explicit casts and assignments.
The patch introduces a single __bf16 builtin - __builtin_nansf16b,
because (__bf16) __builtin_nansf ("") will drop the sNaN into qNaN,
and uses f16b suffix instead of bf16 because there would be ambiguity on
log vs. logb - __builtin_logbf16 could be either log with bf16 suffix
or logb with f16 suffix. In other cases libstdc++ should mostly use
__builtin_*f for std::bfloat16_t overloads (we have a problem with
std::nextafter though but that one we have also for std::float16_t).
2022-10-14 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE.
* tree.h (bfloat16_type_node): Define.
* tree.cc (excess_precision_type): Promote bfloat16_type_mode
like float16_type_mode.
(build_common_tree_nodes): Initialize bfloat16_type_node if
BFmode is supported.
* expmed.h (maybe_expand_shift): Declare.
* expmed.cc (maybe_expand_shift): No longer static.
* expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF
conversions. If there is no optab, handle BF -> {DF,XF,TF,HF}
conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add
-ffast-math generic implementation for BF -> SF and SF -> BF
conversions.
* builtin-types.def (BT_BFLOAT16, BT_FN_BFLOAT16_CONST_STRING): New.
* builtins.def (BUILT_IN_NANSF16B): New builtin.
* fold-const-call.cc (fold_const_call): Handle CFN_BUILT_IN_NANSF16B.
* config/i386/i386.cc (classify_argument): Handle E_BCmode.
(ix86_libgcc_floating_mode_supported_p): Also return true for BFmode
for -msse2.
(ix86_mangle_type): Mangle BFmode as DF16b.
(ix86_invalid_conversion, ix86_invalid_unary_op,
ix86_invalid_binary_op): Remove.
(TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP,
TARGET_INVALID_BINARY_OP): Don't redefine.
* config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove.
(ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than
ix86_bf16_type_node, only create it if still NULL.
* config/i386/i386-builtin-types.def (BFLOAT16): Likewise.
* config/i386/i386.md (cbranchbf4, cstorebf4): New expanders.
gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node,
predefine __BFLT16_*__ macros and for C++23 also
__STDCPP_BFLOAT16_T__. Predefine bfloat16_type_node related
macros for -fbuilding-libgcc.
* c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16.
gcc/c/
* c-typeck.cc (convert_arguments): Don't promote __bf16 to
double.
gcc/cp/
* cp-tree.h (extended_float_type_p): Return true for
bfloat16_type_node.
* typeck.cc (cp_compare_floating_point_conversion_ranks): Set
extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bfloat16,
check_effective_target_bfloat16_runtime, add_options_for_bfloat16):
New.
* gcc.dg/torture/bfloat16-basic.c: New test.
* gcc.dg/torture/bfloat16-builtin.c: New test.
* gcc.dg/torture/bfloat16-builtin-issignaling-1.c: New test.
* gcc.dg/torture/bfloat16-complex.c: New test.
* gcc.dg/torture/builtin-issignaling-1.c: Allow to be includable
from bfloat16-builtin-issignaling-1.c.
* gcc.dg/torture/floatn-basic.h: Allow to be includable from
bfloat16-basic.c.
* gcc.target/i386/vect-bfloat16-typecheck_2.c: Adjust expected
diagnostics.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Likewise.
* gcc.target/i386/vect-bfloat16-typecheck_1.c: Likewise.
* g++.target/i386/bfloat_cpp_typecheck.C: Likewise.
libcpp/
* include/cpplib.h (CPP_N_BFLOAT16): Define.
* expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for
C++.
libgcc/
* config/i386/t-softfp (softfp_extensions): Add bfsf.
(softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf.
(CFLAGS-extendbfsf2.c, CFLAGS-truncsfbf2.c, CFLAGS-truncdfbf2.c,
CFLAGS-truncxfbf2.c, CFLAGS-trunctfbf2.c, CFLAGS-trunchfbf2.c): Add
-msse2.
* config/i386/libgcc-glibc.ver (GCC_13.0.0): Export
__extendbfsf2 and __trunc{s,d,x,t,h}fbf2.
* config/i386/sfp-machine.h (_FP_NANSIGN_B): Define.
* config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define.
* config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define.
* soft-fp/brain.h: New file.
* soft-fp/truncsfbf2.c: New file.
* soft-fp/truncdfbf2.c: New file.
* soft-fp/truncxfbf2.c: New file.
* soft-fp/trunctfbf2.c: New file.
* soft-fp/trunchfbf2.c: New file.
* soft-fp/truncbfhf2.c: New file.
* soft-fp/extendbfsf2.c: New file.
libiberty/
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment.
* cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t
entry.
(cplus_demangle_type): Demangle DF16b.
* testsuite/demangle-expected (_Z3xxxDF16b): New test.
|
|
Add the following new option -fstrict-flex-arrays[=n] and a corresponding
attribute strict_flex_array to GCC:
'-fstrict-flex-arrays'
Control when to treat the trailing array of a structure as a flexible array
member for the purpose of accessing the elements of such an array.
The positive form is equivalent to '-fstrict-flex-arrays=3', which is the
strictest. A trailing array is treated as a flexible array member only when
it declared as a flexible array member per C99 standard onwards.
The negative form is equivalent to '-fstrict-flex-arrays=0', which is the
least strict. All trailing arrays of structures are treated as flexible
array members.
'-fstrict-flex-arrays=LEVEL'
Control when to treat the trailing array of a structure as a flexible array
member for the purpose of accessing the elements of such an array. The value
of LEVEL controls the level of strictness
The possible values of LEVEL are the same as for the
'strict_flex_array' attribute (*note Variable Attributes::).
You can control this behavior for a specific trailing array field
of a structure by using the variable attribute 'strict_flex_array'
attribute (*note Variable Attributes::).
'strict_flex_array (LEVEL)'
The 'strict_flex_array' attribute should be attached to the trailing
array field of a structure. It controls when to treat the trailing array
field of a structure as a flexible array member for the purposes of accessing
the elements of such an array. LEVEL must be an integer betwen 0 to 3.
LEVEL=0 is the least strict level, all trailing arrays of
structures are treated as flexible array members. LEVEL=3 is the
strictest level, only when the trailing array is declared as a
flexible array member per C99 standard onwards ('[]'), it is
treated as a flexible array member.
There are two more levels in between 0 and 3, which are provided to
support older codes that use GCC zero-length array extension
('[0]') or one-element array as flexible array members('[1]'): When
LEVEL is 1, the trailing array is treated as a flexible array member
when it is declared as either '[]', '[0]', or '[1]'; When
LEVEL is 2, the trailing array is treated as a flexible array member
when it is declared as either '[]', or '[0]'.
This attribute can be used with or without the
'-fstrict-flex-arrays'. When both the attribute and the option
present at the same time, the level of the strictness for the
specific trailing array field is determined by the attribute.
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_strict_flex_array_attribute): New function.
(c_common_attribute_table): New item for strict_flex_array.
* c.opt: (fstrict-flex-arrays): New option.
(fstrict-flex-arrays=): New option.
gcc/c/ChangeLog:
* c-decl.cc (flexible_array_member_type_p): New function.
(one_element_array_type_p): Likewise.
(zero_length_array_type_p): Likewise.
(add_flexible_array_elts_to_size): Call new utility
routine flexible_array_member_type_p.
(is_flexible_array_member_p): New function.
(finish_struct): Set the new DECL_NOT_FLEXARRAY flag.
gcc/cp/ChangeLog:
* module.cc (trees_out::core_bools): Stream out new bit
decl_not_flexarray.
(trees_in::core_bools): Stream in new bit decl_not_flexarray.
gcc/ChangeLog:
* doc/extend.texi: Document strict_flex_array attribute.
* doc/invoke.texi: Document -fstrict-flex-arrays[=n] option.
* print-tree.cc (print_node): Print new bit decl_not_flexarray.
* tree-core.h (struct tree_decl_common): New bit field
decl_not_flexarray.
* tree-streamer-in.cc (unpack_ts_decl_common_value_fields): Stream
in new bit decl_not_flexarray.
* tree-streamer-out.cc (pack_ts_decl_common_value_fields): Stream
out new bit decl_not_flexarray.
* tree.cc (array_at_struct_end_p): Update it with the new bit field
decl_not_flexarray.
* tree.h (DECL_NOT_FLEXARRAY): New flag.
gcc/testsuite/ChangeLog:
* g++.dg/strict-flex-array-1.C: New test.
* gcc.dg/strict-flex-array-1.c: New test.
|
|
compiler part except for bfloat16 [PR106652]
The following patch implements the compiler part of C++23
P1467R9 - Extended floating-point types and standard names compiler part
by introducing _Float{16,32,64,128} as keywords and builtin types
like they are implemented for C already since GCC 7, with DF{16,32,64,128}_
mangling.
It also introduces _Float{32,64,128}x for C++ with the
https://github.com/itanium-cxx-abi/cxx-abi/pull/147
proposed mangling of DF{32,64,128}x.
The patch doesn't add anything for bfloat16_t support, as right now
__bf16 type refuses all conversions and arithmetic operations.
The patch wants to keep backwards compatibility with how __float128 has
been handled in C++ before, both for mangling and behavior in binary
operations, overload resolution etc. So, there are some backend changes
where for C __float128 and _Float128 are the same type (float128_type_node
and float128t_type_node are the same pointer), but for C++ they are distinct
types which mangle differently and _Float128 is treated as extended
floating-point type while __float128 is treated as non-standard floating
point type. The various C++23 changes about how floating-point types
are changed are actually implemented as written in the spec only if at least
one of the types involved is _Float{16,32,64,128,32x,64x,128x} (_FloatNx are
also treated as extended floating-point types) and kept previous behavior
otherwise. For float/double/long double the rules are actually written that
they behave the same as before.
There is some backwards incompatibility at least on x86 regarding _Float16,
because that type was already used by that name and with the DF16_ mangling
(but only since GCC 12 and I think it isn't that widely used in the wild
yet). E.g. config/i386/avx512fp16intrin.h shows the issues, where
in C or in GCC 12 in C++ one could pass 0.0f to a builtin taking _Float16
argument, but with the changes that is not possible anymore, one needs
to either use 0.0f16 or (_Float16) 0.0f.
We have also a problem with glibc headers, where since glibc 2.27
math.h and complex.h aren't compilable with these changes. One gets
errors like:
In file included from /usr/include/math.h:43,
from abc.c:1:
/usr/include/bits/floatn.h:86:9: error: multiple types in one declaration
86 | typedef __float128 _Float128;
| ^~~~~~~~~~
/usr/include/bits/floatn.h:86:20: error: declaration does not declare anything [-fpermissive]
86 | typedef __float128 _Float128;
| ^~~~~~~~~
In file included from /usr/include/bits/floatn.h:119:
/usr/include/bits/floatn-common.h:214:9: error: multiple types in one declaration
214 | typedef float _Float32;
| ^~~~~
/usr/include/bits/floatn-common.h:214:15: error: declaration does not declare anything [-fpermissive]
214 | typedef float _Float32;
| ^~~~~~~~
/usr/include/bits/floatn-common.h:251:9: error: multiple types in one declaration
251 | typedef double _Float64;
| ^~~~~~
/usr/include/bits/floatn-common.h:251:16: error: declaration does not declare anything [-fpermissive]
251 | typedef double _Float64;
| ^~~~~~~~
This is from snippets like:
/* The remaining of this file provides support for older compilers. */
# if __HAVE_FLOAT128
/* The type _Float128 exists only since GCC 7.0. */
# if !__GNUC_PREREQ (7, 0) || defined __cplusplus
typedef __float128 _Float128;
# endif
where it hardcodes that C++ doesn't have _Float{16,32,64,128,32x,64x,128x} support nor
{f,F}{16,32,64,128}{,x} literal suffixes nor _Complex _Float{16,32,64,128,32x,64x,128x}.
The patch fixincludes this for now and hopefully if this is committed, then
glibc can change those. The patch changes those
# if !__GNUC_PREREQ (7, 0) || defined __cplusplus
conditions to
# if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
Another thing is mangling, as said above, Itanium C++ ABI specifies
DF <number> _ as _Float{16,32,64,128} mangling, but GCC was implementing
a mangling incompatible with that starting with DF for fixed point types.
Fixed point was never supported in C++ though, I believe the reason why
the mangling has been added was that due to a bug it would leak into the
C++ FE through decltype (0.0r) etc. But that has been shortly after the
mangling was added fixed (I think in the same GCC release cycle), so we
now reject 0.0r etc. in C++. If we ever need the fixed point mangling,
I think it can be readded but better with a different prefix so that it
doesn't conflict with the published standard manglings. So, this patch
also kills the fixed point mangling and implements the DF <number> _
demangling.
The patch predefines __STDCPP_FLOAT{16,32,64,128}_T__ macros when
those types are available, but only for C++23, while the underlying types
are available in C++98 and later including the {f,F}{16,32,64,128} literal
suffixes (but those with a pedwarn for C++20 and earlier). My understanding
is that it needs to be predefined by the compiler, on the other side
predefining even for older modes when <stdfloat> is a new C++23 header
would be weird. One can find out if _Float{16,32,64,128,32x,64x,128x} is
supported in C++ by
__GNUC__ >= 13 && defined(__FLT{16,32,64,128,32X,64X,128X}_MANT_DIG__)
(but that doesn't work well with older G++ 13 snapshots).
As for std::bfloat16_t, three targets (aarch64, arm and x86) apparently
"support" __bf16 type which has the bfloat16 format, but isn't really
usable, e.g. {aarch64,arm,ix86}_invalid_conversion disallow any conversions
from or to type with BFmode, {aarch64,arm,ix86}_invalid_unary_op disallows
any unary operations on those except for ADDR_EXPR and
{aarch64,arm,ix86}_invalid_binary_op disallows any binary operation on
those. So, I think we satisfy:
"If the implementation supports an extended floating-point type with the
properties, as specified by ISO/IEC/IEEE 60559, of radix (b) of 2, storage
width in bits (k) of 16, precision in bits (p) of 8, maximum exponent (emax)
of 127, and exponent field width in bits (w) of 8, then the typedef-name
std::bfloat16_t is defined in the header <stdfloat> and names such a type,
the macro __STDCPP_BFLOAT16_T__ is defined, and the floating-point literal
suffixes bf16 and BF16 are supported."
because we don't really support those right now.
2022-09-27 Jakub Jelinek <jakub@redhat.com>
PR c++/106652
PR c++/85518
gcc/
* tree-core.h (enum tree_index): Add TI_FLOAT128T_TYPE
enumerator.
* tree.h (float128t_type_node): Define.
* tree.cc (build_common_tree_nodes): Initialize float128t_type_node.
* builtins.def (DEF_FLOATN_BUILTIN): Adjust comment now that
_Float<N> is supported in C++ too.
* config/i386/i386.cc (ix86_mangle_type): Only mangle as "g"
float128t_type_node.
* config/i386/i386-builtins.cc (ix86_init_builtin_types): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/i386/avx512fp16intrin.h (_mm_setzero_ph, _mm256_setzero_ph,
_mm512_setzero_ph, _mm_set_sh, _mm_load_sh): Use 0.0f16 instead of
0.0f.
* config/ia64/ia64.cc (ia64_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
* config/rs6000/rs6000-c.cc (is_float128_p): Also return true
for float128t_type_node if non-NULL.
* config/rs6000/rs6000.cc (rs6000_mangle_type): Don't mangle
float128_type_node as "u9__ieee128".
* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Use
float128t_type_node for __float128 instead of float128_type_node
and create it if NULL.
gcc/c-family/
* c-common.cc (c_common_reswords): Change _Float{16,32,64,128} and
_Float{32,64,128}x flags from D_CONLY to 0.
(shorten_binary_op): Punt if common_type returns error_mark_node.
(shorten_compare): Likewise.
(c_common_nodes_and_builtins): For C++ record _Float{16,32,64,128}
and _Float{32,64,128}x builtin types if available. For C++
clear float128t_type_node.
* c-cppbuiltin.cc (c_cpp_builtins): Predefine
__STDCPP_FLOAT{16,32,64,128}_T__ for C++23 if supported.
* c-lex.cc (interpret_float): For q/Q suffixes prefer
float128t_type_node over float128_type_node. Allow
{f,F}{16,32,64,128} suffixes for C++ if supported with pedwarn
for C++20 and older. Allow {f,F}{32,64,128}x suffixes for C++
with pedwarn. Don't call excess_precision_type for C++.
gcc/cp/
* cp-tree.h (cp_compare_floating_point_conversion_ranks): Implement
P1467R9 - Extended floating-point types and standard names except
for std::bfloat16_t for now. Declare.
(extended_float_type_p): New inline function.
* mangle.cc (write_builtin_type): Mangle float{16,32,64,128}_type_node
as DF{16,32,64,128}_. Mangle float{32,64,128}x_type_node as
DF{32,64,128}x. Remove FIXED_POINT_TYPE mangling that conflicts
with that.
* typeck2.cc (check_narrowing): If one of ftype or type is extended
floating-point type, compare floating-point conversion ranks.
* parser.cc (cp_keyword_starts_decl_specifier_p): Handle
CASE_RID_FLOATN_NX.
(cp_parser_simple_type_specifier): Likewise and diagnose missing
_Float<N> or _Float<N>x support if not supported by target.
* typeck.cc (cp_compare_floating_point_conversion_ranks): New function.
(cp_common_type): If both types are REAL_TYPE and one or both are
extended floating-point types, select common type based on comparison
of floating-point conversion ranks and subranks.
(cp_build_binary_op): Diagnose operation with floating point arguments
with unordered conversion ranks.
* call.cc (standard_conversion): For floating-point conversion, if
either from or to are extended floating-point types, set conv->bad_p
for implicit conversion from larger to smaller conversion rank or
with unordered conversion ranks.
(convert_like_internal): Emit a pedwarn on such conversions.
(build_conditional_expr): Diagnose operation with floating point
arguments with unordered conversion ranks.
(convert_arg_to_ellipsis): Don't promote extended floating-point types
narrower than double to double.
(compare_ics): Implement P1467R9 [over.ics.rank]/4 changes.
gcc/testsuite/
* g++.dg/cpp23/ext-floating1.C: New test.
* g++.dg/cpp23/ext-floating2.C: New test.
* g++.dg/cpp23/ext-floating3.C: New test.
* g++.dg/cpp23/ext-floating4.C: New test.
* g++.dg/cpp23/ext-floating5.C: New test.
* g++.dg/cpp23/ext-floating6.C: New test.
* g++.dg/cpp23/ext-floating7.C: New test.
* g++.dg/cpp23/ext-floating8.C: New test.
* g++.dg/cpp23/ext-floating9.C: New test.
* g++.dg/cpp23/ext-floating10.C: New test.
* g++.dg/cpp23/ext-floating.h: New file.
* g++.target/i386/float16-1.C: Adjust expected diagnostics.
libcpp/
* expr.cc (interpret_float_suffix): Allow {f,F}{16,32,64,128} and
{f,F}{32,64,128}x suffixes for C++.
include/
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
(struct demangle_component): Add u.s_extended_builtin member.
libiberty/
* cp-demangle.c (d_dump): Handle
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE. Don't handle
DEMANGLE_COMPONENT_FIXED_TYPE.
(d_make_extended_builtin_type): New function.
(cplus_demangle_builtin_types): Add _Float entry.
(cplus_demangle_type): For DF demangle it as _Float<N> or
_Float<N>x rather than fixed point which conflicts with it.
(d_count_templates_scopes): Handle
DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE. Just break; for
DEMANGLE_COMPONENT_FIXED_TYPE.
(d_find_pack): Handle DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
Don't handle DEMANGLE_COMPONENT_FIXED_TYPE.
(d_print_comp_inner): Likewise.
* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Bump.
* testsuite/demangle-expected: Replace _Z3xxxDFyuVb test
with _Z3xxxDF16_DF32_DF64_DF128_CDF16_Vb. Add
_Z3xxxDF32xDF64xDF128xCDF32xVb test.
fixincludes/
* inclhack.def (glibc_cxx_floatn_1, glibc_cxx_floatn_2,
glibc_cxx_floatn_3): New fixes.
* tests/base/bits/floatn.h: New file.
* fixincl.x: Regenerated.
|
|
The following patch implements part of the OpenMP 5.2 changes related
to ordered loops and with the assumed resolution of
https://github.com/OpenMP/spec/issues/3302 issues.
The changes are:
1) the depend clause on stand-alone ordered constructs has been renamed
to doacross (because depend clause has different syntax on other
constructs) with some syntax changes below, depend clause is deprecated
(we'll deprecate stuff on the GCC side only when we have everything else
from 5.2 implemented)
depend(source) -> doacross(source:) or doacross(source:omp_cur_iteration)
depend(sink:vec) -> doacross(sink:vec) (where vec has the same syntax
as before)
2) in 5.1 and before it has been significant whether ordered clause has or
doesn't have an argument, if it didn't, only block-associated ordered
could appear in the body, if it did, only stand-alone ordered could appear
in the body, all loops had to be perfectly nested, no associated
range-based for loops, no linear clause on work-sharing loop and ordered
clause with an argument wasn't allowed on composite for simd.
In 5.2, whether ordered clause has or doesn't have an argument is
insignificant (except for bugs in the standard, #3302 mentions those),
if the argument is missing, it is simply treated as equal to collapse
argument (if any, otherwise 1). The implementation better should be able
to differentiate between ordered and doacross loops at compile time
which previously was through the absence or presence of the argument,
now it is done through looking at the body of the construct lexically
and looking for stand-alone ordered constructs. If there are any,
it is to be handled as doacross loop, otherwise it is ordered loop
(but in that case ordered argument if present must be equal to collapse
argument - 5.2 says instead it must be one, but that is clearly wrong
and mentioned in #3302) - stand-alone ordered constructs must appear
lexically in the body (and had to before as well). For the restrictions
mentioned above, the for simd restriction is gone (stand-alone ordered
can't appear in simd construct, so that is enough), and the other rules
are expected to be changed into something related to presence of
stand-alone ordered constructs in the body
3) 5.2 allows a new syntax, doacross(sink:omp_cur_iteration-1), which
means wait for previous iteration in the iteration space of all the
associated loops
The following patch implements that, except that we sorry for now
on the doacross(sink:omp_cur_iteration-1) syntax during omp expansion
because library side isn't done yet for it. It doesn't implement it for
the Fortran FE either.
Incrementally, I'd like to change the way we differentiate between
stand-alone and block-associated ordered constructs, because the current
way of looking for presence of doacross clause doesn't work well if those
clauses are removed because they had been invalid (wrong syntax or
unknown variables in it etc.) and of course implement
doacross(sink:omp_cur_iteration-1).
2022-09-03 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DOACROSS.
(enum omp_clause_depend_kind): Remove OMP_CLAUSE_DEPEND_SOURCE
and OMP_CLAUSE_DEPEND_SINK, add OMP_CLAUSE_DEPEND_INVALID.
(enum omp_clause_doacross_kind): New type.
(struct tree_omp_clause): Add subcode.doacross_kind member.
* tree.h (OMP_CLAUSE_DEPEND_SINK_NEGATIVE): Remove.
(OMP_CLAUSE_DOACROSS_KIND): Define.
(OMP_CLAUSE_DOACROSS_SINK_NEGATIVE): Define.
(OMP_CLAUSE_DOACROSS_DEPEND): Define.
(OMP_CLAUSE_ORDERED_DOACROSS): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE_DOACROSS entries.
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
* tree-pretty-print.cc (dump_omp_clause): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK. Handle
OMP_CLAUSE_DOACROSS.
* gimplify.cc (gimplify_omp_depend): Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
(gimplify_scan_omp_clauses): Likewise. Handle OMP_CLAUSE_DOACROSS.
(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(find_standalone_omp_ordered): New function.
(gimplify_omp_for): When OMP_CLAUSE_ORDERED is present, search
body for OMP_ORDERED with OMP_CLAUSE_DOACROSS and if found,
set OMP_CLAUSE_ORDERED_DOACROSS.
(gimplify_omp_ordered): Don't handle OMP_CLAUSE_DEPEND_SINK or
OMP_CLAUSE_DEPEND_SOURCE, instead check OMP_CLAUSE_DOACROSS, adjust
diagnostics that presence or absence of ordered clause parameter
is irrelevant. Handle doacross(sink:omp_cur_iteration-1). Use
actual user name of the clause - doacross or depend - in diagnostics.
* omp-general.cc (omp_extract_for_data): Don't set fd->ordered
if !OMP_CLAUSE_ORDERED_DOACROSS (t). If
OMP_CLAUSE_ORDERED_DOACROSS (t) but !OMP_CLAUSE_ORDERED_EXPR (t),
set fd->ordered to -1 and set it after the loop in that case to
fd->collapse.
* omp-low.cc (check_omp_nesting_restrictions): Don't handle
OMP_CLAUSE_DEPEND_SOURCE nor OMP_CLAUSE_DEPEND_SINK, instead check
OMP_CLAUSE_DOACROSS. Use actual user name of the clause - doacross
or depend - in diagnostics. Diagnose mixing of stand-alone and
block associated ordered constructs binding to the same loop.
(lower_omp_ordered_clauses): Don't handle OMP_CLAUSE_DEPEND_SINK,
instead handle OMP_CLAUSE_DOACROSS.
(lower_omp_ordered): Look for OMP_CLAUSE_DOACROSS instead of
OMP_CLAUSE_DEPEND.
(lower_depend_clauses): Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK.
* omp-expand.cc (expand_omp_ordered_sink): Emit a sorry for
doacross(sink:omp_cur_iteration-1).
(expand_omp_ordered_source_sink): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE. Use actual user name of the clause
- doacross or depend - in diagnostics.
(expand_omp): Look for OMP_CLAUSE_DOACROSS clause instead of
OMP_CLAUSE_DEPEND.
(build_omp_regions_1): Likewise.
(omp_make_gimple_edges): Likewise.
* lto-streamer-out.cc (hash_tree): Handle OMP_CLAUSE_DOACROSS.
* tree-streamer-in.cc (unpack_ts_omp_clause_value_fields): Likewise.
* tree-streamer-out.cc (pack_ts_omp_clause_value_fields): Likewise.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DOACROSS.
* c-omp.cc (c_finish_omp_depobj): Check also for OMP_CLAUSE_DOACROSS
clause and diagnose it. Don't handle OMP_CLAUSE_DEPEND_SOURCE and
OMP_CLAUSE_DEPEND_SINK. Assert kind is not OMP_CLAUSE_DEPEND_INVALID.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Handle doacross.
(c_parser_omp_clause_depend_sink): Renamed to ...
(c_parser_omp_clause_doacross_sink): ... this. Add depend_p argument.
Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(c_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(c_parser_omp_clause_doacross): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(c_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(c_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(c_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
Don't handle OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Handle doacross.
(cp_parser_omp_clause_depend_sink): Renamed to ...
(cp_parser_omp_clause_doacross_sink): ... this. Add depend_p
argument. Handle parsing of doacross(sink:omp_cur_iteration-1). Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS instead
of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND flag on it.
(cp_parser_omp_clause_depend): Use OMP_CLAUSE_DOACROSS_SINK and
OMP_CLAUSE_DOACROSS_SOURCE instead of OMP_CLAUSE_DEPEND_SINK and
OMP_CLAUSE_DEPEND_SOURCE, build OMP_CLAUSE_DOACROSS for depend(source)
and set OMP_CLAUSE_DOACROSS_DEPEND on it.
(cp_parser_omp_clause_doacross): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_depobj): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
(cp_parser_omp_for_loop): Don't diagnose here linear clause together
with ordered with argument.
(cp_parser_omp_simd): Don't diagnose ordered clause with argument on
for simd.
(OMP_ORDERED_DEPEND_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_DOACROSS.
(cp_parser_omp_ordered): Handle also doacross and adjust for it
diagnostic wording.
* pt.cc (tsubst_omp_clause_decl): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE.
(tsubst_omp_clauses): Handle OMP_CLAUSE_DOACROSS.
(tsubst_expr): Use OMP_CLAUSE_DEPEND_INVALID instead of
OMP_CLAUSE_DEPEND_SOURCE.
* semantics.cc (cp_finish_omp_clause_depend_sink): Rename to ...
(cp_finish_omp_clause_doacross_sink): ... this.
(finish_omp_clauses): Handle OMP_CLAUSE_DOACROSS. Don't handle
OMP_CLAUSE_DEPEND_SOURCE and OMP_CLAUSE_DEPEND_SINK.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Use
OMP_CLAUSE_DOACROSS_SINK_NEGATIVE instead of
OMP_CLAUSE_DEPEND_SINK_NEGATIVE, build OMP_CLAUSE_DOACROSS
clause instead of OMP_CLAUSE_DEPEND and set OMP_CLAUSE_DOACROSS_DEPEND
on it.
gcc/testsuite/
* c-c++-common/gomp/doacross-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/doacross-5.c: New test.
* c-c++-common/gomp/doacross-6.c: New test.
* c-c++-common/gomp/nesting-2.c: Adjust expected diagnostics.
* c-c++-common/gomp/ordered-3.c: Likewise.
* c-c++-common/gomp/sink-3.c: Likewise.
* gfortran.dg/gomp/nesting-2.f90: Likewise.
|
|
Currently SSA_NAME_RANGE_INFO only handles integer ranges, and loses
half the precision in the process because its use of legacy
value_range's. This patch rewrites all the SSA_NAME_RANGE_INFO
(nonzero bits included) to use the recently contributed
vrange_storage. With it, we'll be able to efficiently save any ranges
supported by ranger in GC memory. Presently this will only be
irange's, but shortly we'll add floating ranges and others to the mix.
As per the discussion with the trailing_wide_ints adjustments and
vrange_storage, we'll be able to save integer ranges with a maximum of
5 sub-ranges. This could be adjusted later if more sub-ranges are
needed (unlikely).
Since this is a behavior changing patch, I would like to take a few
days for discussion, and commit early next week if all goes well.
A few notes.
First, we get rid of the SSA_NAME_ANTI_RANGE_P bit in the SSA_NAME
since we store full resolution ranges. Perhaps it could be re-used
for something else.
The range_info_def struct is gone in favor of an opaque type handled
by vrange_storage. It currently supports irange, but will support
frange, prange, etc, in due time.
From the looks of it, set_range_info was an update operation despite
its name, as we improved the nonzero bits with each call, even though
we clobbered the ranges. Presumably this was because doing a proper
intersect of ranges lost information with the anti-range hack. We no
longer have this limitation so now we formalize both set_range_info
and set_nonzero_bits to an update operation. After all, we should
never be losing information, but enhancing it whenever possible. This
means, that if folks' finger-memory is not offended, as a follow-up,
I'd like to rename set_nonzero_bits and set_range_info to update_*.
I have kept the same global API we had in tree-ssanames.h, with the
caveat that all set operations are now update as discussed above.
There is a 2% performance penalty for evrp and a 3% penalty for VRP
that is coincidentally in line with a previous improvement of the same
amount in the vrange abstraction patchset. Interestingly, this
penalty is mostly due to the wide int to tree dance we keep doing with
irange and legacy. In a first draft of this patch where I was
streaming trees directly, there was actually a small improvement
instead. I hope to get some of the gain back when we move irange's to
wide-ints, though I'm not in a hurry ;-).
Tested and benchmarked on x86-64 Linux. Tested on ppc64le Linux.
Comments welcome.
gcc/ChangeLog:
* gimple-range.cc (gimple_ranger::export_global_ranges): Remove
verification against legacy value_range.
(gimple_ranger::register_inferred_ranges): Same.
(gimple_ranger::export_global_ranges): Rename update_global_range
to set_range_info.
* tree-core.h (struct range_info_def): Remove.
(struct irange_storage_slot): New.
(struct tree_base): Remove SSA_NAME_ANTI_RANGE_P documentation.
(struct tree_ssa_name): Add vrange_storage support.
* tree-ssanames.cc (range_info_p): New.
(range_info_fits_p): New.
(range_info_alloc): New.
(range_info_free): New.
(range_info_get_range): New.
(range_info_set_range): New.
(set_range_info_raw): Remove.
(set_range_info): Adjust to use vrange_storage.
(set_nonzero_bits): Same.
(get_nonzero_bits): Same.
(duplicate_ssa_name_range_info): Remove overload taking
value_range_kind.
Rewrite tree overload to use vrange_storage.
(duplicate_ssa_name_fn): Adjust to use vrange_storage.
* tree-ssanames.h (struct range_info_def): Remove.
(set_range_info): Adjust prototype to take vrange.
* tree-vrp.cc (vrp_asserts::remove_range_assertions): Call
duplicate_ssa_name_range_info.
* tree.h (SSA_NAME_ANTI_RANGE_P): Remove.
(SSA_NAME_RANGE_TYPE): Remove.
* value-query.cc (get_ssa_name_range_info): Adjust to use
vrange_storage.
(update_global_range): Remove.
(get_range_global): Remove as_a<irange>.
* value-query.h (update_global_range): Remove.
* tree-ssa-dom.cc (set_global_ranges_from_unreachable_edges):
Rename update_global_range to set_range_info.
* value-range-storage.cc (vrange_storage::alloc_slot): Remove
gcc_unreachable.
|
|
When not optimizing, we can't do anything useful with unreachability in
terms of code performance, so we might as well improve debugging by turning
__builtin_unreachable into a trap. I think it also makes sense to do this
when we're explicitly optimizing for the debugging experience.
In the PR richi suggested introducing an -funreachable-traps flag for this.
This functionality is already implemented as -fsanitize=unreachable
-fsanitize-trap=unreachable, and we want to share the implementation, but it
does seem useful to have a separate flag that isn't affected by the various
sanitization controls. -fsanitize=unreachable takes priority over
-funreachable-traps if both are enabled.
Jakub observed that this would slow down -O0 by default from running the
sanopt pass, so this revision avoids the need for sanopt by rewriting calls
introduced by the compiler immediately, and calls written by the user at
fold time. Many of the calls introduced by the compiler are also rewritten
immediately to ubsan calls when not trapping, which fixes ubsan-8b.C;
previously the call to f() was optimized away before sanopt. But this early
rewriting isn't practical for uses of __builtin_unreachable in
devirtualization and such, so sanopt rewriting is still done for
non-trapping sanitize.
PR c++/104642
gcc/ChangeLog:
* common.opt: Add -funreachable-traps.
* doc/invoke.texi (-funreachable-traps): Document it.
* opts.cc (finish_options): Enable at -O0 or -Og.
* tree.cc (build_common_builtin_nodes): Add __builtin_trap.
(builtin_decl_unreachable, build_builtin_unreachable): New.
* tree.h: Declare them.
* ubsan.cc (sanitize_unreachable_fn): Factor out.
(ubsan_instrument_unreachable): Use
gimple_build_builtin_unreachable.
* ubsan.h (sanitize_unreachable_fn): Declare.
* gimple.cc (gimple_build_builtin_unreachable): New.
* gimple.h: Declare it.
* builtins.cc (expand_builtin_unreachable): Add assert.
(fold_builtin_0): Call build_builtin_unreachable.
* sanopt.cc: Don't run for just SANITIZE_RETURN
or SANITIZE_UNREACHABLE when trapping.
* cgraphunit.cc (walk_polymorphic_call_targets): Use new
unreachable functions.
* gimple-fold.cc (gimple_fold_call)
(gimple_get_virt_method_for_vtable)
* ipa-fnsummary.cc (redirect_to_unreachable)
* ipa-prop.cc (ipa_make_edge_direct_to_target)
(ipa_impossible_devirt_target)
* ipa.cc (walk_polymorphic_call_targets)
* tree-cfg.cc (pass_warn_function_return::execute)
(execute_fixup_cfg)
* tree-ssa-loop-ivcanon.cc (remove_exits_and_undefined_stmts)
(unloop_loops)
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt):
Likewise.
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_builtin_function_call): Handle
unreachable/trap earlier.
* cp-gimplify.cc (cp_maybe_instrument_return): Use
build_builtin_unreachable.
gcc/testsuite/ChangeLog:
* g++.dg/ubsan/return-8a.C: New test.
* g++.dg/ubsan/return-8b.C: New test.
* g++.dg/ubsan/return-8d.C: New test.
* g++.dg/ubsan/return-8e.C: New test.
|
|
The syntax for linear clause changed in 5.2, the original syntax
which is still valid is:
linear (var1, var2)
linear (var3, var4 : step1)
The 4.5 syntax with modifiers like:
linear (val (var5, var6))
linear (val (var7, var8) : step2)
is still supported in 5.2, but is deprecated there.
Instead, one can use a new syntax:
linear (var9, var10 : val)
linear (var11, var12 : step (step3), val)
As val, ref, uval or step (someexpr) can be valid expressions (and especially
in C++ can be const / constexpr / consteval), the spec says that
when the whole step expression is val (or ref or uval) or step ( ... )
then it is the new modifier syntax, one can use + 0 or 0 + or 1 * or * 1
or ()s to say it is the old step expression.
Also, 5.2 now allows val modifier to be specified even outside of declare simd
(but not the other modifiers). I've implemented this for the new modifier
syntax only, the old one keeps the old restriction (which is why
OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER flag has been introduced).
2022-06-07 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree.h (OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER): Define.
* tree-pretty-print.cc (dump_omp_clause) <case OMP_CLAUSE_LINEAR>:
Adjust clause printing style depending on
OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER.
gcc/c/
* c-parser.cc (c_parser_omp_clause_linear): Parse OpenMP 5.2
style linear clause modifiers. Set
OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER flag on the clauses when
old style modifiers are used.
* c-typeck.cc (c_finish_omp_clauses): Only reject linear clause
with val modifier on simd or for if the old style modifiers are
used.
gcc/cp/
* parser.cc (cp_parser_omp_clause_linear): Parse OpenMP 5.2
style linear clause modifiers. Set
OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER flag on the clauses when
old style modifiers are used.
* semantics.cc (finish_omp_clauses): Only reject linear clause
with val modifier on simd or for if the old style modifiers are
used.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Set
OMP_CLAUSE_LINEAR_OLD_LINEAR_MODIFIER on OMP_CLAUSE_LINEAR
clauses unconditionally for now.
gcc/testsuite/
* c-c++-common/gomp/linear-2.c: New test.
* c-c++-common/gomp/linear-3.c: New test.
* g++.dg/gomp/linear-3.C: New test.
* g++.dg/gomp/linear-4.C: New test.
* g++.dg/gomp/linear-5.C: New test.
|
|
OpenMP 5.1 and earlier had 2 different uses of to clause, one for target
update construct with one semantics, and one for declare target directive
with a different semantics.
Under the hood we were using OMP_CLAUSE_TO_DECLARE to represent the latter.
OpenMP 5.2 renamed the declare target clause to to enter, the old one is
kept as a deprecated alias.
As we are far from having full OpenMP 5.2 support, this patch adds support
for the enter clause (and renames OMP_CLAUSE_TO_DECLARE to OMP_CLAUSE_ENTER
with a flag to tell the spelling of the clause for better diagnostics),
but doesn't deprecate the to clause on declare target just yet (that
should be done as one of the last steps in 5.2 support).
2022-05-27 Jakub Jelinek <jakub@redhat.com>
gcc/
* tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TO_DECLARE
to OMP_CLAUSE_ENTER.
* tree.h (OMP_CLAUSE_ENTER_TO): Define.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Rename
OMP_CLAUSE_TO_DECLARE to OMP_CLAUSE_ENTER.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE_ENTER
instead of OMP_CLAUSE_TO_DECLARE, if OMP_CLAUSE_ENTER_TO, print
"to" instead of "enter".
* tree-nested.cc (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
gcc/c-family/
* c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_ENTER.
gcc/c/
* c-parser.cc (c_parser_omp_clause_name): Parse enter clause.
(c_parser_omp_all_clauses): For to clause on declare target, use
OMP_CLAUSE_ENTER clause with OMP_CLAUSE_ENTER_TO instead of
OMP_CLAUSE_TO_DECLARE clause. Handle PRAGMA_OMP_CLAUSE_ENTER.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add enter clause.
(c_parser_omp_declare_target): Use OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
* c-typeck.cc (c_finish_omp_clauses): Handle OMP_CLAUSE_ENTER instead
of OMP_CLAUSE_TO_DECLARE, to OMP_CLAUSE_ENTER_TO use "to" as clause
name in diagnostics instead of
omp_clause_code_name[OMP_CLAUSE_CODE (c)].
gcc/cp/
* parser.cc (cp_parser_omp_clause_name): Parse enter clause.
(cp_parser_omp_all_clauses): For to clause on declare target, use
OMP_CLAUSE_ENTER clause with OMP_CLAUSE_ENTER_TO instead of
OMP_CLAUSE_TO_DECLARE clause. Handle PRAGMA_OMP_CLAUSE_ENTER.
(OMP_DECLARE_TARGET_CLAUSE_MASK): Add enter clause.
(cp_parser_omp_declare_target): Use OMP_CLAUSE_ENTER instead of
OMP_CLAUSE_TO_DECLARE.
* semantics.cc (finish_omp_clauses): Handle OMP_CLAUSE_ENTER instead
of OMP_CLAUSE_TO_DECLARE, to OMP_CLAUSE_ENTER_TO use "to" as clause
name in diagnostics instead of
omp_clause_code_name[OMP_CLAUSE_CODE (c)].
gcc/testsuite/
* c-c++-common/gomp/clauses-3.c: Add tests with enter clause instead
of to or modify some existing to clauses to enter.
* c-c++-common/gomp/declare-target-1.c: Likewise.
* c-c++-common/gomp/declare-target-2.c: Likewise.
* c-c++-common/gomp/declare-target-3.c: Likewise.
* g++.dg/gomp/attrs-9.C: Likewise.
* g++.dg/gomp/declare-target-1.C: Likewise.
libgomp/
* testsuite/libgomp.c-c++-common/target-40.c: Modify some existing to
clauses to enter.
* testsuite/libgomp.c/target-41.c: Likewise.
|
|
tree.h already contains combined_fn handling at the top and moving
code_helper away from gimple-match.h makes improving the gimple_build
API easier.
2022-05-16 Richard Biener <rguenther@suse.de>
* gimple-match.h (code_helper): Move class ...
* tree.h (code_helper): ... here.
|
|
This patch implements a simple tree wrapper, named tree_vec_range, which
lets us idiomatically loop over all the elements of a TREE_VEC using a
C++11 range-based for loop:
// v is a TREE_VEC
for (tree e : tree_vec_range (v))
...
This is similar to the existing tree-based range adaptors ovl_range and
lkp_range added to the C++ FE in r12-340-g3307b9a07a3c51.
This patch also converts some existing loops over TREE_VEC within the
C++ FE to use tree_vec_range / range-for.
gcc/cp/ChangeLog:
* constraint.cc (tsubst_parameter_mapping): Convert loop over
TREE_VEC into a range-based for loop using tree_vec_range.
* pt.cc (iterative_hash_template_arg): Likewise.
(template_parms_level_to_args): Likewise.
(deducible_template_args): Likewise.
(check_undeduced_parms): Likewise.
(dependent_type_p_r): Likewise.
(value_dependent_expression_p) <case NONTYPE_ARGUMENT_PACK>:
Likewise.
(dependent_template_arg_p): Likewise.
* tree.cc (cp_walk_subtrees) <case NONTYPE_ARGUMENT_PACK>:
Likewise.
gcc/ChangeLog:
* tree.h (TREE_VEC_BEGIN): Define.
(TREE_VEC_END): Correct 'length' member access.
(class tree_vec_range): Define.
|
|
The following removes the indirection to real_value from REAL_CST
which doesn't seem to serve any useful purpose. Any sharing can
be achieved by sharing the actual REAL_CST (which is what usually
happens when copying trees) and sharing of real_value amongst
different REAL_CST doesn't happen as far as I can see and would
only lead to further issues like mismatching type and real_value.
2022-04-27 Richard Biener <rguenther@suse.de>
* tree-core.h (tree_real_cst::real_cst_ptr): Remove pointer
to real_value field.
(tree_real_cst::value): Add real_value field.
* tree.h (TREE_REAL_CST_PTR): Adjust.
* tree.cc (build_real): Remove separate allocation.
* tree-streamer-in.cc (unpack_ts_real_cst_value_fields):
Likewise.
gcc/cp/
* module.cc (trees_in::core_vals): Remove separate allocation
for REAL_CST.
|
|
gcc/jit/
PR jit/104071
* docs/_build/texinfo/libgccjit.texi: Regenerate.
* docs/topics/compatibility.rst (LIBGCCJIT_ABI_21): New ABI tag.
* docs/topics/expressions.rst: Add documentation for the
function gcc_jit_context_new_bitcast.
* jit-playback.cc: New function (new_bitcast).
* jit-playback.h: New function (new_bitcast).
* jit-recording.cc: New functions (new_bitcast,
bitcast::replay_into, bitcast::visit_children,
bitcast::make_debug_string, bitcast::write_reproducer).
* jit-recording.h: New class (bitcast) and new function
(new_bitcast, bitcast::replay_into, bitcast::visit_children,
bitcast::make_debug_string, bitcast::write_reproducer,
bitcast::get_precedence).
* libgccjit.cc: New function (gcc_jit_context_new_bitcast)
* libgccjit.h: New function (gcc_jit_context_new_bitcast)
* libgccjit.map (LIBGCCJIT_ABI_21): New ABI tag.
gcc/testsuite/
PR jit/104071
* jit.dg/all-non-failing-tests.h: Add new test-bitcast.
* jit.dg/test-bitcast.c: New test.
* jit.dg/test-error-bad-bitcast.c: New test.
* jit.dg/test-error-bad-bitcast2.c: New test.
gcc/
PR jit/104071
* toplev.cc: Call the new function tree_cc_finalize in
toplev::finalize.
* tree.cc: New functions (clear_nonstandard_integer_type_cache
and tree_cc_finalize) to clear the cache of non-standard integer
types to avoid having issues with some optimizations of
bitcast where the SSA_NAME will have a size of a cached
integer type that should have been invalidated, causing a
comparison of integer constant to fail.
* tree.h: New function (tree_cc_finalize).
|
|
This patch fixes a wrong -Wimplicit-fallthrough warning for
case 0:
if (1) // wrong may fallthrough
return 0;
case 1:
which in .gimple looks like
<D.1981>: // case 0
if (1 != 0) goto <D.1985>; else goto <D.1986>;
<D.1985>:
D.1987 = 0;
// predicted unlikely by early return (on trees) predictor.
return D.1987;
<D.1986>: // dead
<D.1982>: // case 1
and the warning thinks that <D.1986>: falls through to <D.1982>:. It
does not know that <D.1986> is effectively a dead label, only reachable
through fallthrough from previous instructions, never jumped to. To
that effect, Jakub introduced UNUSED_LABEL_P, which is set on such dead
labels.
collect_fallthrough_labels has code to deal with cases like
case 2:
if (e != 10)
i++; // this may fallthru, warn
else
return 44;
case 3:
which collects labels that may fall through. Here it sees the "goto <D.1990>;"
at the end of the then branch and so when the warning reaches
...
<D.1990>: // from if-then
<D.1984>: // case 3
it knows it should warn about the possible fallthrough. But an UNUSED_LABEL_P
is not a label that can fallthrough like that, so it should ignore those.
However, we still want to warn about this:
case 0:
if (1)
n++; // falls through
case 1:
so collect_fallthrough_labels needs to return the "n = n + 1;" statement, rather
than the dead label.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
PR middle-end/103597
gcc/ChangeLog:
* gimplify.cc (collect_fallthrough_labels): Don't push UNUSED_LABEL_Ps
into labels. Maybe set prev to the statement preceding UNUSED_LABEL_P.
(gimplify_cond_expr): Set UNUSED_LABEL_P.
* tree.h (UNUSED_LABEL_P): New.
gcc/testsuite/ChangeLog:
* c-c++-common/Wimplicit-fallthrough-39.c: New test.
|
|
gcc/ChangeLog:
* tree.h (IDENTIFIER_LENGTH): Add comment.
|
|
lowering [PR100280]
... in preparation for later changes. No functional change.
Follow-up to commit 9b32c1669aad5459dd053424f9967011348add83
"OpenACC 'kernels' decomposition: Mark variables used in
synthesized data clauses as addressable [PR100280]".
PR middle-end/100280
gcc/
* tree.h (OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE): New.
* tree-core.h: Document it.
* omp-low.cc (scan_sharing_clauses) <OMP_CLAUSE_MAP>: Handle
'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'.
* omp-oacc-kernels-decompose.cc (maybe_build_inner_data_region):
Set 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' instead of
'TREE_ADDRESSABLE'.
gcc/testsuite/
* c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/kernels-decompose-2.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr100280-1.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/f-asyncwait-1.c: Adjust.
* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
Likewise.
|
|
PR ipa/104533
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_target_clones_attribute): Use
get_target_clone_attr_len and report warning soon.
gcc/ChangeLog:
* multiple_target.cc (get_attr_len): Move to tree.c.
(expand_target_clones): Remove single value checking.
* tree.cc (get_target_clone_attr_len): New fn.
* tree.h (get_target_clone_attr_len): Likewise.
gcc/testsuite/ChangeLog:
* g++.target/i386/pr104533.C: New test.
|
|
This adds a flag to CONSTRUCTOR nodes indicating that for
clobbers this marks the end-of-life of storage as opposed to
just ending the lifetime of the object that occupied it.
The dangling pointer diagnostics uses CLOBBERs but is confused
by those emitted by the C++ frontend for example which emits
them for the second purpose at the start of CTORs. The issue
is also appearant for aarch64 in PR104092.
Distinguishing the two cases is also necessary for the PR90348 fix.
Since I'm going to add another flag I added an enum clobber_flags
and a defaulted argument to build_clobber plus a convenient way to
query the enum from the CTOR tree and specify it for gimple_clobber_p.
Since 'CLOBBER' is already taken and I needed a name for the unspecified
clobber we have now I used 'CLOBBER_UNDEF'.
2022-02-03 Richard Biener <rguenther@suse.de>
PR middle-end/90348
PR middle-end/104092
gcc/
* tree-core.h (clobber_kind): New enum.
(tree_base::u::bits::address_space): Document use in CONSTRUCTORs.
* tree.h (CLOBBER_KIND): Add.
(build_clobber): Add clobber kind argument, defaulted to
CLOBBER_UNDEF.
* tree.cc (build_clobber): Likewise.
* gimple.h (gimple_clobber_p): New overload with specified kind.
* tree-streamer-in.cc (streamer_read_tree_bitfields): Stream
CLOBBER_KIND.
* tree-streamer-out.cc (streamer_write_tree_bitfields):
Likewise.
* tree-pretty-print.cc (dump_generic_node): Mark EOL CLOBBERs.
* gimplify.cc (gimplify_bind_expr): Build storage end-of-life clobbers
with CLOBBER_EOL.
(gimplify_target_expr): Likewise.
* tree-inline.cc (expand_call_inline): Likewise.
* tree-ssa-ccp.cc (insert_clobber_before_stack_restore): Likewise.
* gimple-ssa-warn-access.cc (pass_waccess::check_stmt): Only treat
CLOBBER_EOL clobbers as ending lifetime of storage.
gcc/lto/
* lto-common.cc (compare_tree_sccs_1): Compare CLOBBER_KIND.
gcc/testsuite/
* gcc.dg/pr87052.c: Adjust.
|
|
gcc/ChangeLog:
* tree.h (struct tree_vec_map_cache_hasher): Move from...
* tree.cc (struct tree_vec_map_cache_hasher): ...here.
|