aboutsummaryrefslogtreecommitdiff
path: root/gcc/doc
diff options
context:
space:
mode:
authorAlexander Monakov <amonakov@ispras.ru>2016-11-16 20:17:00 +0300
committerAlexander Monakov <amonakov@gcc.gnu.org>2016-11-16 20:17:00 +0300
commit5012919d0bd344ac1888e8e531072f0ccbe24d2c (patch)
tree9db609d99ee4957a92a3ad468eb36d855e6c1bc6 /gcc/doc
parent2fe2aba3cd7c2daf16c545bc7fa34481157bfcaf (diff)
downloadgcc-5012919d0bd344ac1888e8e531072f0ccbe24d2c.zip
gcc-5012919d0bd344ac1888e8e531072f0ccbe24d2c.tar.gz
gcc-5012919d0bd344ac1888e8e531072f0ccbe24d2c.tar.bz2
nvptx backend prerequisites for OpenMP offloading
gcc/ * config/nvptx/mkoffload.c (main): Check that either OpenACC or OpenMP is selected. Pass -mgomp to offload compiler in OpenMP case. * config/nvptx/nvptx-protos.h (nvptx_shuffle_kind): Move enum declaration from nvptx.c. (nvptx_gen_shuffle): Declare. (nvptx_output_set_softstack): Declare. * config/nvptx/nvptx.c (nvptx_shuffle_kind): Move to nvptx-protos.h. (need_softstack_decl): New variable. (need_unisimt_decl): New variable. (diagnose_openacc_conflict): New. Use it... (nvptx_option_override): ...here. Handle TARGET_GOMP. (nvptx_encode_section_info): Handle "shared" attribute. (write_as_kernel): Restrict to OpenACC target regions. (init_softstack_frame): New. (nvptx_init_unisimt_predicate): New. (write_omp_entry): New. Use it... (nvptx_declare_function_name): ...here to emit OpenMP target region entrypoints. Handle TARGET_SOFT_STACK. Call nvptx_init_unisimt_predicate. (nvptx_output_set_softstack): New. (nvptx_get_drap_rtx): Return %argp as the DRAP if needed. (nvptx_gen_shuffle): Export. (nvptx_output_call_insn): Handle COND_EXEC patterns. Emit instruction predicate. (nvptx_print_operand): Fix handling of instruction predicates. (nvptx_get_unisimt_master): New helper function. (nvptx_get_unisimt_predicate): Ditto. (nvptx_call_insn_is_syscall_p): Ditto. (nvptx_unisimt_handle_set): Ditto. (nvptx_reorg_uniform_simt): New. Transform code for -muniform-simt. (nvptx_reorg): Call nvptx_reorg_uniform_simt. (nvptx_handle_shared_attribute): New. Use it... (nvptx_attribute_table): ... here (new entry). (nvptx_record_offload_symbol): Handle NULL attributes. (nvptx_file_end): Handle need_softstack_decl and need_unisimt_decl. (nvptx_simt_vf): New. (TARGET_SIMT_VF): Define. * config/nvptx/nvptx.h (TARGET_CPU_CPP_BUILTINS): Define __nvptx_softstack or __nvptx_unisimt__ when -msoft-stack, or resp. -muniform-simt option is active. (STACK_SIZE_MODE): Define. (FIXED_REGISTERS): Adjust. (SOFTSTACK_SLOT_REGNUM): New. (SOFTSTACK_PREV_REGNUM): New. (REGISTER_NAMES): Adjust. (struct machine_function): New fields. * config/nvptx/nvptx.md (UNSPEC_SET_SOFTSTACK): New. (UNSPEC_VOTE_BALLOT): Ditto. (UNSPEC_LANEID): Ditto. (UNSPECV_NOUNROLL): Ditto. (atomic): New attribute. (predicable): New attribute. Generate predicated forms via define_cond_exec. (br_true): Mark as not predicable. (br_false): Ditto. (br_true_uni): Ditto. (br_false_uni): Ditto. (return): Ditto. (trap_if_true): Ditto. (trap_if_false): Ditto. (nvptx_fork): Ditto. (nvptx_forked): Ditto. (nvptx_joining): Ditto. (nvptx_join): Ditto. (nvptx_barsync): Ditto. (epilogue): Emit stack restore if TARGET_SOFT_STACK. (allocate_stack): Implement for TARGET_SOFT_STACK. Remove unused code. (allocate_stack_<mode>): Remove unused pattern. (set_softstack_insn): New pattern. (restore_stack_block): Handle for TARGET_SOFT_STACK. (nvptx_vote_ballot): New pattern. (omp_simt_lane): Ditto. (omp_simt_last_lane): Ditto. (omp_simt_ordered): Ditto. (omp_simt_vote_any): Ditto. (omp_simt_xchg_bfly): Ditto. (omp_simt_xchg_idx): Ditto. (nvptx_nounroll): Ditto. (atomic_compare_and_swap<mode>_1): Mark with atomic attribute. (atomic_exchange<mode>): Ditto. (atomic_fetch_add<mode>): Ditto. (atomic_fetch_addsf): Ditto. (atomic_fetch_<logic><mode>): Ditto. * config/nvptx/nvptx.opt: (msoft-stack): New option. (muniform-simt): Ditto. (mgomp): Ditto. * config/nvptx/t-nvptx (MULTILIB_OPTIONS): New. * doc/extend.texi (Nvidia PTX Variable Attributes): New section. * doc/invoke.texi (msoft-stack): Document. (muniform-simt): Document (mgomp): Document. * doc/tm.texi: Regenerate. * doc/tm.texi.in: (TARGET_SIMT_VF): New hook. * target.def: Define it. * target-insns.def (omp_simt_lane): New. (omp_simt_last_lane): New. (omp_simt_ordered): New. (omp_simt_vote_any): New. (omp_simt_xchg_bfly): New. (omp_simt_xchg_idx): New. libgcc/ * config/nvptx/crt0.c (__main): Setup __nvptx_stacks and __nvptx_uni. * config/nvptx/mgomp.c: New file. * config/nvptx/t-nvptx: Add mgomp.c gcc/testsuite/ * lib/target-supports.exp (check_effective_target_alloca): Use a compile test. * gcc.target/nvptx/softstack.c: New test. * gcc.target/nvptx/decl-shared.c: New test. * gcc.target/nvptx/decl-shared-init.c: New test. From-SVN: r242503
Diffstat (limited to 'gcc/doc')
-rw-r--r--gcc/doc/extend.texi15
-rw-r--r--gcc/doc/invoke.texi31
-rw-r--r--gcc/doc/tm.texi4
-rw-r--r--gcc/doc/tm.texi.in2
4 files changed, 52 insertions, 0 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0669f79..4dcc7f6 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5576,6 +5576,7 @@ attributes.
* MeP Variable Attributes::
* Microsoft Windows Variable Attributes::
* MSP430 Variable Attributes::
+* Nvidia PTX Variable Attributes::
* PowerPC Variable Attributes::
* RL78 Variable Attributes::
* SPU Variable Attributes::
@@ -6257,6 +6258,20 @@ same name (@pxref{MSP430 Function Attributes}).
These attributes can be applied to both functions and variables.
@end table
+@node Nvidia PTX Variable Attributes
+@subsection Nvidia PTX Variable Attributes
+
+These variable attributes are supported by the Nvidia PTX back end:
+
+@table @code
+@item shared
+@cindex @code{shared} attribute, Nvidia PTX
+Use this attribute to place a variable in the @code{.shared} memory space.
+This memory space is private to each cooperative thread array; only threads
+within one thread block refer to the same instance of the variable.
+The runtime does not initialize variables in this memory space.
+@end table
+
@node PowerPC Variable Attributes
@subsection PowerPC Variable Attributes
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1d24b31..620225c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20570,6 +20570,37 @@ offloading execution.
Apply partitioned execution optimizations. This is the default when any
level of optimization is selected.
+@item -msoft-stack
+@opindex msoft-stack
+Generate code that does not use @code{.local} memory
+directly for stack storage. Instead, a per-warp stack pointer is
+maintained explicitly. This enables variable-length stack allocation (with
+variable-length arrays or @code{alloca}), and when global memory is used for
+underlying storage, makes it possible to access automatic variables from other
+threads, or with atomic instructions. This code generation variant is used
+for OpenMP offloading, but the option is exposed on its own for the purpose
+of testing the compiler; to generate code suitable for linking into programs
+using OpenMP offloading, use option @option{-mgomp}.
+
+@item -muniform-simt
+@opindex muniform-simt
+Switch to code generation variant that allows to execute all threads in each
+warp, while maintaining memory state and side effects as if only one thread
+in each warp was active outside of OpenMP SIMD regions. All atomic operations
+and calls to runtime (malloc, free, vprintf) are conditionally executed (iff
+current lane index equals the master lane index), and the register being
+assigned is copied via a shuffle instruction from the master lane. Outside of
+SIMD regions lane 0 is the master; inside, each thread sees itself as the
+master. Shared memory array @code{int __nvptx_uni[]} stores all-zeros or
+all-ones bitmasks for each warp, indicating current mode (0 outside of SIMD
+regions). Each thread can bitwise-and the bitmask at position @code{tid.y}
+with current lane index to compute the master lane index.
+
+@item -mgomp
+@opindex mgomp
+Generate code for use in OpenMP offloading: enables @option{-msoft-stack} and
+@option{-muniform-simt} options, and selects corresponding multilib variant.
+
@end table
@node PDP-11 Options
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 85341ae..84bba07 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5862,6 +5862,10 @@ usable. In that case, the smaller the number is, the more desirable it is
to use it.
@end deftypefn
+@deftypefn {Target Hook} int TARGET_SIMT_VF (void)
+Return number of threads in SIMT thread group on the target.
+@end deftypefn
+
@deftypefn {Target Hook} bool TARGET_GOACC_VALIDATE_DIMS (tree @var{decl}, int *@var{dims}, int @var{fn_level})
This hook should check the launch dimensions provided for an OpenACC
compute region, or routine. Defaulted values are represented as -1
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 400d574..9afd5daa 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4295,6 +4295,8 @@ address; but often a machine-dependent strategy can generate better code.
@hook TARGET_SIMD_CLONE_USABLE
+@hook TARGET_SIMT_VF
+
@hook TARGET_GOACC_VALIDATE_DIMS
@hook TARGET_GOACC_DIM_LIMIT