Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Change 521c6785e1fc94d added the enum but missed the semicolon.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
|
|
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
|
|
Signed-off-by: Carlos O'Donell <carlos@redhat.com>
---
nptl/
2013-07-19 Dominik Vogt <vogt@de.ibm.com>
* sysdeps/unix/sysv/linux/x86/elision-conf.c:
Remove __rwlock_rtm_enabled and __rwlock_rtm_read_retries.
(elision_init): Don't set __rwlock_rtm_enabled.
* sysdeps/unix/sysv/linux/x86/elision-conf.h:
Remove __rwlock_rtm_enabled.
|
|
The generated header is compiled with `-ffreestanding' to avoid any
circular dependencies against the installed implementation headers.
Such a dependency would require the implementation header to be
installed before the generated header could be built (See bug 15711).
In current practice the generated header dependencies do not include
any of the implementation headers removed by the use of `-ffreestanding'.
---
2013-07-15 Carlos O'Donell <carlos@redhat.com>
[BZ #15711]
* sysdeps/unix/sysv/linux/Makefile ($(objpfx)bits/syscall%h):
Avoid system header dependency with -ffreestanding.
($(objpfx)bits/syscall%d): Likewise.
|
|
* math/libm-test.inc (casin_test_data): Annotate more cases of missing
underflows from atanl/atan2l due to bug 15319.
(casinh_test_data): Likewise.
|
|
|
|
|
|
* sysdeps/sparc/fpu/libm-test-ulps: Regenerate from scratch.
|
|
A recently-added test (dlfcn/tststatic5) pointed out that tile was not
properly initializing the variable pagesize in certain cases. This
change just copies the existing code from MIPS.
|
|
|
|
The sfp-machine.h is based on the gcc version, but extended with
required new macros by comparison with other architectures and by
investigating the hardware support for FP on tile.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
* sysdeps/i386/fpu/libm-test-ulps: Update.
|
|
* sysdeps/sparc/fpu/libm-test-ulps: Update.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Can be enabled with --enable-lock-elision=yes at configure time.
|
|
PTHREAD_MUTEX_NORMAL requires deadlock for nesting, DEFAULT
does not. Since glibc uses the same value (0) disable elision
for any call to pthread_mutexattr_settype() with a 0 value.
This implies that a program can disable elision by doing
pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL)
Based on a original proposal by Rich Felker.
|
|
Add elision paths to the basic mutex locks.
The normal path has a check for RTM and upgrades the lock
to RTM when available. Trylocks cannot automatically upgrade,
so they check for elision every time.
We use a 4 byte value in the mutex to store the lock
elision adaptation state. This is separate from the adaptive
spin state and uses a separate field.
Condition variables currently do not support elision.
Recursive mutexes and condition variables may be supported at some point,
but are not in the current implementation. Also "trylock" will
not automatically enable elision unless some other lock call
has been already called on the lock.
This version does not use IFUNC, so it means every lock has one
additional check for elision. Benchmarking showed the overhead
to be negligible.
|
|
tst-mutex5 and 8 test some behaviour not required by POSIX,
that elision changes. This changes these tests to not check
this when elision is enabled at configure time.
|
|
Add Enable/disable flags used internally
Extend the mutex initializers to have the fields needed for
elision. The layout stays the same, and this is not visible
to programs.
These changes are not exposed outside pthread
|
|
Lock elision using TSX is a technique to optimize lock scaling
It allows to run locks in parallel using hardware support for
a transactional execution mode in 4th generation Intel Core CPUs.
See http://www.intel.com/software/tsx for more Information.
This patch implements a simple adaptive lock elision algorithm based
on RTM. It enables elision for the pthread mutexes and rwlocks.
The algorithm keeps track whether a mutex successfully elides or not,
and stops eliding for some time when it is not.
When the CPU supports RTM the elision path is automatically tried,
otherwise any elision is disabled.
The adaptation algorithm and its tuning is currently preliminary.
The code adds some checks to the lock fast paths. Micro-benchmarks
show little to no difference without RTM.
This patch implements the low level "lll_" code for lock elision.
Followon patches hook this into the pthread implementation
Changes with the RTM mutexes:
-----------------------------
Lock elision in pthreads is generally compatible with existing programs.
There are some obscure exceptions, which are expected to be uncommon.
See the manual for more details.
- A broken program that unlocks a free lock will crash.
There are ways around this with some tradeoffs (more code in hot paths)
I'm still undecided on what approach to take here; have to wait for testing reports.
- pthread_mutex_destroy of a lock mutex will not return EBUSY but 0.
- There's also a similar situation with trylock outside the mutex,
"knowing" that the mutex must be held due to some other condition.
In this case an assert failure cannot be recovered. This situation is
usually an existing bug in the program.
- Same applies to the rwlocks. Some of the return values changes
(for example there is no EDEADLK for an elided lock, unless it aborts.
However when elided it will also never deadlock of course)
- Timing changes, so broken programs that make assumptions about specific timing
may expose already existing latent problems. Note that these broken programs will
break in other situations too (loaded system, new faster hardware, compiler
optimizations etc.)
- Programs with non recursive mutexes that take them recursively in a thread and
which would always deadlock without elision may not always see a deadlock.
The deadlock will only happen on an early or delayed abort (which typically
happens at some point)
This only happens for mutexes not explicitely set to PTHREAD_MUTEX_NORMAL
or PTHREAD_MUTEX_ADAPTIVE_NP. PTHREAD_MUTEX_NORMAL mutexes do not elide.
The elision default can be set at configure time.
This patch implements the basic infrastructure for elision.
|
|
|
|
|
|
|
|
|
|
Now that the fallback functions match the desired semantics for tile
functions, just switch to using them.
|
|
|
|
SSE2/SSSE3 versions are faster than SSE4.2 versions on Intel Silvermont.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|