author | Lipeng Zhu <lipeng.zhu@intel.com> | 2023-12-09 10:39:45 -0500 |
---|---|---|
committer | H.J. Lu <(no_default)> | 2023-12-11 09:43:59 -0800 |
commit | b806c88fab3f9c6833563f9a44b608dd5dd14de9 (patch) | |
tree | 3d3efb7c62f71bc7ea89efc6f1fa37ff7b8a124c /libgomp/testsuite | |
parent | 624e274ca3a4405a55662fa72d1163120df0e03d (diff) | |
libgfortran: Replace mutex with rwlock
This patch introduces an rwlock and uses it, instead of the mutex, to
separate reads from writes on the unit_root tree and unit_cache, which
improves CPU efficiency. In the get_gfc_unit function, only around 30%
of calls step into the insert_unit function; in most instances the unit
is found while reading the unit_cache or the unit_root tree. Splitting
the read and write phases with an rwlock therefore lets those lookups
run in parallel.
In our test on a server with 220 cores, the IPC metric improved by
around 9x. The benchmark we used is
https://github.com/rwesson/NEAT
libgcc/ChangeLog:
* gthr-posix.h (__GTHREAD_RWLOCK_INIT): New macro.
(__gthrw): New function.
(__gthread_rwlock_rdlock): New function.
(__gthread_rwlock_tryrdlock): New function.
(__gthread_rwlock_wrlock): New function.
(__gthread_rwlock_trywrlock): New function.
(__gthread_rwlock_unlock): New function.
libgfortran/ChangeLog:
* io/async.c (DEBUG_LINE): New macro.
* io/async.h (RWLOCK_DEBUG_ADD): New macro.
(CHECK_RDLOCK): New macro.
(CHECK_WRLOCK): New macro.
(TAIL_RWLOCK_DEBUG_QUEUE): New macro.
(IN_RWLOCK_DEBUG_QUEUE): New macro.
(RDLOCK): New macro.
(WRLOCK): New macro.
(RWUNLOCK): New macro.
(RD_TO_WRLOCK): New macro.
(INTERN_RDLOCK): New macro.
(INTERN_WRLOCK): New macro.
(INTERN_RWUNLOCK): New macro.
* io/io.h (struct gfc_unit): Change UNIT_LOCK to UNIT_RWLOCK in
a comment.
(unit_lock): Remove including associated internal_proto.
(unit_rwlock): New declarations including associated internal_proto.
(dec_waiting_unlocked): Use WRLOCK and RWUNLOCK on unit_rwlock
instead of __gthread_mutex_lock and __gthread_mutex_unlock on
unit_lock.
* io/transfer.c (st_read_done_worker): Use WRLOCK and RWUNLOCK on
unit_rwlock instead of LOCK and UNLOCK on unit_lock.
(st_write_done_worker): Likewise.
* io/unit.c: Change UNIT_LOCK to UNIT_RWLOCK in 'IO locking rules'
comment. Use unit_rwlock variable instead of unit_lock variable.
(get_gfc_unit_from_unit_root): New function.
(get_gfc_unit): Use RDLOCK, WRLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(close_unit_1): Use WRLOCK and RWUNLOCK on unit_rwlock instead of
LOCK and UNLOCK on unit_lock.
(close_units): Likewise.
(newunit_alloc): Use RWUNLOCK on unit_rwlock instead of UNLOCK on
unit_lock.
* io/unix.c (find_file): Use RDLOCK and RWUNLOCK on unit_rwlock
instead of LOCK and UNLOCK on unit_lock.
(flush_all_units): Use WRLOCK and RWUNLOCK on unit_rwlock instead
of LOCK and UNLOCK on unit_lock.
Diffstat (limited to 'libgomp/testsuite')
-rw-r--r-- | libgomp/testsuite/libgomp.fortran/rwlock_1.f90 | 33 | ||||
-rw-r--r-- | libgomp/testsuite/libgomp.fortran/rwlock_2.f90 | 22 | ||||
-rw-r--r-- | libgomp/testsuite/libgomp.fortran/rwlock_3.f90 | 18 |
3 files changed, 73 insertions, 0 deletions
diff --git a/libgomp/testsuite/libgomp.fortran/rwlock_1.f90 b/libgomp/testsuite/libgomp.fortran/rwlock_1.f90
new file mode 100644
index 0000000..f90ecbe
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/rwlock_1.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+! Multiple threads call open/write/read/close in concurrency with different unit number,
+! threads can acquire read lock concurrently, to find unit from cache or unit list very frequently,
+! if not found, threads will acquire the write lock exclusively to insert unit to cache and unit list.
+! This test case is used to stress both the read and write lock when access unit cache and list.
+program main
+  use omp_lib
+  implicit none
+  integer:: unit_number, v1, v2, i
+  character(11) :: file_name
+  character(3) :: async = "no"
+  !$omp parallel private (unit_number, v1, v2, file_name, async, i)
+    do i = 0, 100
+      unit_number = 10 + omp_get_thread_num ()
+      write (file_name, "(I3, A)") unit_number, "_tst.dat"
+      file_name = adjustl(file_name)
+      open (unit_number, file=file_name, asynchronous="yes")
+      ! call inquire with file parameter to test find_file in unix.c
+      inquire (file=file_name, asynchronous=async)
+      if (async /= "YES") stop 1
+      write (unit_number, *, asynchronous="yes") unit_number
+      write (unit_number, *, asynchronous="yes") unit_number + 1
+      close(unit_number)
+
+      open (unit_number, file = file_name, asynchronous="yes")
+      read (unit_number, *, asynchronous="yes") v1
+      read (unit_number, *, asynchronous="yes") v2
+      wait (unit_number)
+      if ((v1 /= unit_number) .or. (v2 /= unit_number + 1)) stop 2
+      close(unit_number, status="delete")
+    end do
+  !$omp end parallel
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/rwlock_2.f90 b/libgomp/testsuite/libgomp.fortran/rwlock_2.f90
new file mode 100644
index 0000000..08c80d1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/rwlock_2.f90
@@ -0,0 +1,22 @@
+! { dg-do run }
+! Insert a unit into cache at the beginning, then start multiple
+! threads to access the same unit concurrency, unit will be found in unit cache during the read lock phase.
+! This test case is used to test the read lock when access unit cache and list.
+program main
+  use omp_lib
+  implicit none
+  integer:: thread_id, total_threads, i, j
+  total_threads = omp_get_max_threads ()
+  open (10, file='tst.dat', asynchronous="yes")
+  !$omp parallel private (thread_id, i, j)
+    do i = 1, 100
+      thread_id = omp_get_thread_num ()
+      do j = 1, 100
+        write (10, *, asynchronous="yes") thread_id, i
+      end do
+    end do
+  !$omp end parallel
+  ! call inquire with file parameter to test find_file in unix.c
+  call flush ()
+  close (10, status="delete")
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/rwlock_3.f90 b/libgomp/testsuite/libgomp.fortran/rwlock_3.f90
new file mode 100644
index 0000000..1906fcd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/rwlock_3.f90
@@ -0,0 +1,18 @@
+! { dg-do run }
+! Find or create the same unit number in concurrency,
+! at beginning, threads cannot find the unit in cache or unit list,
+! then threads will acquire the write lock to insert unit.
+! This test case is used to ensure that no duplicate unit number will be
+! inserted into cache nor unit list when same unit was accessed in concurrency.
+program main
+  use omp_lib
+  implicit none
+  integer:: i
+  !$omp parallel private (i)
+    do i = 1, 100
+      open (10, file='tst.dat', asynchronous="yes")
+      ! Delete the unit number from cache and unit list to stress write lock.
+      close (10, status="delete")
+    end do
+  !$omp end parallel
+end program