amdgcn, libgomp: low-latency allocator

This implements the OpenMP low-latency memory allocator for AMD GCN using the
small per-team LDS memory (Local Data Store).

Since addresses can now refer to LDS space, the "Global" address space is
no longer compatible.  This patch therefore switches the backend to use
entirely "Flat" addressing (which supports both memories).  A future patch
will re-enable "global" instructions for cases where it is known to be safe
to do so.

gcc/ChangeLog:

	* config/gcn/gcn-builtins.def (DISPATCH_PTR): New built-in.
	* config/gcn/gcn.cc (gcn_init_machine_status): Disable global
	addressing.
	(gcn_expand_builtin_1): Implement GCN_BUILTIN_DISPATCH_PTR.

libgomp/ChangeLog:

	* config/gcn/libgomp-gcn.h (TEAM_ARENA_START): Move to here.
	(TEAM_ARENA_FREE): Likewise.
	(TEAM_ARENA_END): Likewise.
	(GCN_LOWLAT_HEAP): New.
	* config/gcn/team.c (LITTLEENDIAN_CPU): New, and import hsa.h.
	(__gcn_lowlat_init): New prototype.
	(gomp_gcn_enter_kernel): Initialize the low-latency heap.
	* libgomp.h (TEAM_ARENA_START): Move to libgomp.h.
	(TEAM_ARENA_FREE): Likewise.
	(TEAM_ARENA_END): Likewise.
	* plugin/plugin-gcn.c (lowlat_size): New variable.
	(print_kernel_dispatch): Label the group_segment_size purpose.
	(init_environment_variables): Read GOMP_GCN_LOWLAT_POOL.
	(create_kernel_dispatch): Pass low-latency head allocation to kernel.
	(run_kernel): Use shadow; don't assume values.
	* testsuite/libgomp.c/omp_alloc-traits.c: Enable for amdgcn.
	* config/gcn/allocator.c: New file.
	* libgomp.texi: Document low-latency implementation details.
author: Andrew Stubbs <ams@codesourcery.com> 2023-01-30 14:43:00 +0000
committer: Andrew Stubbs <ams@codesourcery.com> 2023-12-06 16:48:57 +0000
commit: e7d6c277fa28c0b9b621d23c471e0388d2912644 (patch)
tree: 3ef9390ef49f8deefa281fd7ad2a145ad85254a6 /libgomp/testsuite
parent: e9a19ead498fcc89186b724c6e76854f7751a89b (diff)
1 files changed, 1 insertions, 1 deletions
diff --git a/libgomp/testsuite/libgomp.c/omp_alloc-traits.c b/libgomp/testsuite/libgomp.c/omp_alloc-traits.c
index 4ff0fca..e9acc86 100644
--- a/libgomp/testsuite/libgomp.c/omp_alloc-traits.c
+++ b/libgomp/testsuite/libgomp.c/omp_alloc-traits.c
@@ -1,7 +1,7 @@
 /* { dg-do run } */
 
 /* { dg-require-effective-target offload_device } */
-/* { dg-xfail-if "not implemented" { ! offload_target_nvptx } } */
+/* { dg-xfail-if "not implemented" { ! { offload_target_nvptx || offload_target_amdgcn } } } */
 
 /* Test that GPU low-latency allocation is limited to team access.  */
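The commit message describes the low-latency heap as a small, per-team pool. As a rough host-side illustration of what that constraint means for callers, the sketch below models a fixed-size pool with a simple bump allocator that returns NULL once the pool is exhausted. This is not the code added in config/gcn/allocator.c: the names `team_pool`, `pool_init`, `pool_alloc`, and the 1024-byte `POOL_SIZE` are hypothetical, and the real allocator may use a different strategy and size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool size in bytes; chosen only for illustration.  */
#define POOL_SIZE 1024
/* Every allocation is rounded up to a multiple of this many bytes.  */
#define POOL_ALIGN 8

/* Illustrative stand-in for one team's small, fixed-size heap.  */
typedef struct
{
  uint64_t words[POOL_SIZE / sizeof (uint64_t)]; /* 8-byte aligned storage */
  size_t used;                                   /* bytes handed out so far */
} team_pool;

/* Reset the pool so that the whole buffer is available again.  */
static void
pool_init (team_pool *p)
{
  assert (p != NULL);
  p->used = 0;
}

/* Bump-allocate SIZE bytes from the pool.  Returns NULL when SIZE is zero
   or when the request does not fit in the remaining space, so the caller
   can fall back to another memory space.  */
static void *
pool_alloc (team_pool *p, size_t size)
{
  assert (p != NULL);
  if (size == 0 || size > POOL_SIZE)
    return NULL;

  /* Round up to the alignment; cannot overflow because size <= POOL_SIZE.  */
  size_t rounded = (size + POOL_ALIGN - 1) & ~(size_t) (POOL_ALIGN - 1);
  if (rounded > POOL_SIZE - p->used)
    return NULL;

  void *ptr = (unsigned char *) p->words + p->used;
  p->used += rounded;
  return ptr;
}
```

A caller would run `pool_init` once per team, then treat a NULL result from `pool_alloc` as the signal that the small pool is full. The model omits freeing and any synchronization between threads, both of which a real per-team allocator has to handle.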