Age | Commit message (Collapse) | Author | Files | Lines |
|
Linux kernel v5.9 added support for system calls using the scv
instruction for POWER9 and later. The new codepath provides better
performance (see below) if compared to using sc. For the
foreseeable future, both sc and scv mechanisms will co-exist, so this
patch enables glibc to do a runtime check and use scv when it is
available.
Before issuing the system call to the kernel, we check hwcap2 in the TCB
for PPC_FEATURE2_SCV to see if scv is supported by the kernel. If not,
we fallback to sc and keep the old behavior.
The kernel implements a different error return convention for scv, so
when returning from a system call we need to handle the return value
differently depending on the instruction we used to enter the kernel.
For syscalls implemented in ASM, entry and exit are implemented by
different macros (PSEUDO and PSEUDO_RET, resp.), which may be used in
sequence (e.g. for templated syscalls) or with other instructions in
between (e.g. clone). To avoid accessing the TCB a second time on
PSEUDO_RET to check which instruction we used, the value read from
hwcap2 is cached on a non-volatile register.
This is not needed when using INTERNAL_SYSCALL macro, since entry and
exit are bundled into the same inline asm directive.
The dynamic loader may issue syscalls before the TCB has been setup
so it always uses sc with no extra checks. For the static case, there
is no compile-time way to determine if we are inside startup code,
so we also check the value of the thread pointer before effectively
accessing the TCB. For such situations in which the availability of
scv cannot be determined, sc is always used.
Support for scv in syscalls implemented in their own ASM file (clone and
vfork) will be added later. For now simply use sc as before.
Average performance over 1M calls for each syscall "type":
- stat: C wrapper calling INTERNAL_SYSCALL
- getpid: templated ASM syscall
- syscall: call to gettid using syscall function
Standard:
stat : 1.573445 us / ~3619 cycles
getpid : 0.164986 us / ~379 cycles
syscall : 0.162743 us / ~374 cycles
With scv:
stat : 1.537049 us / ~3535 cycles <~ -84 cycles / -2.32%
getpid : 0.109923 us / ~253 cycles <~ -126 cycles / -33.25%
syscall : 0.116410 us / ~268 cycles <~ -106 cycles / -28.34%
Tested on powerpc, powerpc64, powerpc64le (with and without scv)
Tested-by: Lucas A. M. Magalhães <lamm@linux.ibm.com>
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
Similar to the fix 69fda43b8d, save and restore errno for the hook
functions used for MALLOC_CHECK_=3.
It fixes the malloc/tst-free-errno-mcheck regression.
Checked on x86_64-linux-gnu.
|
|
Add some tests for fpclassify, isnan, isinf and issignaling.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Add support to treat pseudo-numbers specially and implement x86
version to consider all of them as signaling.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
With xmknod wrapper functions removed (589260cef8), the mknod functions
are now properly exported, and version is done using symbols versioning
instead of the extra _MKNOD_* argument.
It also allows us to consolidate Linux and Hurd mknod implementation.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
|
|
With xstat wrapper functions removed (8ed005daf0), the stat functions
are now properly exported, and version is done using symbols versioning
instead of the extra _STAT_* argument.
Reviewed-by: Lukasz Majewski <lukma@denx.de>
|
|
In the next release of POSIX, free must preserve errno
<https://www.austingroupbugs.net/view.php?id=385>.
Modify __libc_free to save and restore errno, so that
any internal munmap etc. syscalls do not disturb the caller's errno.
Add a test malloc/tst-free-errno.c (almost all by Bruno Haible),
and document that free preserves errno.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
We need it to get the RPC API version.
|
|
The new __proc_waitid RPC now expects WEXITED to be passed, allowing to
properly implement waitid, and thus define the missing W* macros
(according to FreeBSD values).
|
|
Tests such as posix/tst-waitid.c make heavy use of
support_process_state_wait, and thus on non-Linux where it falls back
to sleeping, a 2s sleep makes such test time out, while 1s remains
fine enough.
|
|
Instead of having the arch-specific trampoline setup code detect whether
preemption happened or not, we'd rather pass it the sigaction. In the
future, this may also allow to change sa_flags from post_signal().
|
|
When argv is empty, we need to add the original script to be run on the
shell command line.
|
|
Reported by Bruno Haible <bruno@clisp.org>
|
|
Remove _ISOMAC check from <cpu-features.h> since it isn't an installer
header file.
|
|
CPU_FEATURE_CPU_P is defined in sysdeps/x86/sys/platform/x86.h. Remove
the duplicated CPU_FEATURE_CPU_P in sysdeps/x86/include/cpu-features.h.
|
|
Do not attempt to fix the significand top bit in long double input
received in printf. The code should never reach here because isnan
should now detect unnormals as NaN. This is already a NOP for glibc
since it uses the gcc __builtin_isnan, which detects unnormals as NaN.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
This syncs up isnanl behaviour with gcc. Also move the isnanl
implementation to sysdeps/x86 and remove the sysdeps/x86_64 version.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Also move sysdeps/i386/fpu/s_fpclassifyl.c to
sysdeps/x86/fpu/s_fpclassifyl.c and remove
sysdeps/x86_64/fpu/s_fpclassifyl.c
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
The MTE patch to add malloc support incorrectly padded the size passed
to _int_realloc by SIZE_SZ when it ought to have sent just the
chunksize. Revert that bit of the change so that realloc works
correctly with MALLOC_CHECK_ set.
This also brings the realloc_check implementation back in sync with
libc_realloc.
|
|
This new variable allows various subsystems in glibc to run all or
some of their tests with MALLOC_CHECK_=3. This patch adds
infrastructure support for this variable as well as an implementation
in malloc/Makefile to allow running some of the tests with
MALLOC_CHECK_=3.
At present some tests in malloc/ have been excluded from the mcheck
tests either because they're specifically testing MALLOC_CHECK_ or
they are failing in master even without the Memory Tagging patches
that prompted this work. Some tests were reviewed and found to need
specific error points that MALLOC_CHECK_ defeats by terminating early
but a thorough review of all tests is needed to bring them into mcheck
coverage.
The following failures are seen in current master:
FAIL: malloc/tst-malloc-fork-deadlock-mcheck
FAIL: malloc/tst-malloc-stats-cancellation-mcheck
FAIL: malloc/tst-malloc-thread-fail-mcheck
FAIL: malloc/tst-realloc-mcheck
FAIL: malloc/tst-reallocarray-mcheck
All of these are due to the Memory Tagging patchset and will be fixed
separately.
|
|
|
|
The ferror results in an unnecessary PLT reference. Use
__ferror_unlocked instead , which gets inlined.
|
|
For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710,
as seen on a z13 system.
|
|
For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710,
as seen on a POWER8 system.
|
|
The addmntent function replicates elements of struct mnt on stack
using alloca, which is unsafe. Put characters directly into the
stream, escaping them as they're being written out.
Also add a test to check all escaped characters with addmntent and
getmntent.
|
|
Add Intel Linear Address Masking (LAM) support to <sys/platform/x86.h>.
HAS_CPU_FEATURE (LAM) can be used to detect if LAM is enabled in CPU.
LAM modifies the checking that is applied to 64-bit linear addresses,
allowing software to use of the untranslated address bits for metadata.
|
|
For new inputs added in commit cad5ad81d2f7f58a7ad0d8afa8c1b710.
|
|
For new test cases in
commit cad5ad81d2f7f58a7ad0d8afa8c1b7101a0301fb
|
|
This final patch provides the architecture-specific implementation of
the memory-tagging support hooks for aarch64.
|
|
Add various defines and stubs for enabling MTE on AArch64 sysv-like
systems such as Linux. The HWCAP feature bit is copied over in the
same way as other feature bits. Similarly we add a new wrapper header
for mman.h to define the PROT_MTE flag that can be used with mmap and
related functions.
We add a new field to struct cpu_features that can be used, for
example, to check whether or not certain ifunc'd routines should be
bound to MTE-safe versions.
Finally, if we detect that MTE should be enabled (ie via the glibc
tunable); we enable MTE during startup as required.
Support in the Linux kernel was added in version 5.10.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
|
|
Older versions of the Linux kernel headers obviously lack support for
memory tagging, but we still want to be able to build in support when
using those (obviously it can't be enabled on such systems).
The linux kernel extensions are made to the platform-independent
header (linux/prctl.h), so this patch takes a similar approach.
|
|
This patch adds the basic support for memory tagging.
Various flavours are supported, particularly being able to turn on
tagged memory at run-time: this allows the same code to be used on
systems where memory tagging support is not present without neededing
a separate build of glibc. Also, depending on whether the kernel
supports it, the code will use mmap for the default arena if morecore
does not, or cannot support tagged memory (on AArch64 it is not
available).
All the hooks use function pointers to allow this to work without
needing ifuncs.
Reviewed-by: DJ Delorie <dj@redhat.com>
|
|
Add a new glibc tunable: mem.tagging. This is a decimal constant in
the range 0-255 but used as a bit-field.
Bit 0 enables use of tagged memory in the malloc family of functions.
Bit 1 enables precise faulting of tag failure on platforms where this
can be controlled.
Other bits are currently unused, but if set will cause memory tag
checking for the current process to be enabled in the kernel.
|
|
This patch adds the configuration machinery to allow memory tagging to be
enabled from the command line via the configure option --enable-memory-tagging.
The current default is off, though in time we may change that once the API
is more stable.
|
|
This is clever, but it confuses downstream detection in at least zstd
and GNOME's glib. zstd has preprocessor tests for the 'st_mtime' macro,
which is not provided by the path using the anonymous union; glib checks
for the presence of 'st_mtimensec' in struct stat but then tries to
access that field in struct statx (which might be a bug on its own).
Checked with a build for alpha-linux-gnu.
|
|
|
|
setjmp() uses C code to store current registers into jmp_buf
environment. -fstack-protector-all places canary into setjmp()
prologue and clobbers 'a5' before it gets saved.
The change inhibits stack canary injection to avoid clobber.
|
|
add iconv_close before the function returned with bad value.
|
|
|
|
The byte 0xfe as input to the EUC-KR conversion denotes a user-defined
area and is not allowed. The from_euc_kr function used to skip two bytes
when told to skip over the unknown designation, potentially running over
the buffer end.
|
|
Mach actually rather fills the uesp field, not esp.
|
|
This change is required in order to correctly release per-thread
resources. Directly reusing the threading library reference isn't
possible since the sigstate is also used early in the main thread,
before threading is initialized.
* hurd/hurd/signal.h (_hurd_self_sigstate): Drop thread reference after
calling _hurd_thread_sigstate.
(_hurd_critical_section_lock): Likewise.
* hurd/hurdsig.c (_hurd_thread_sigstate): Add a reference on the thread.
(_hurd_sigstate_delete): Drop thread reference.
|
|
When SA_SIGINFO is available, sysdeps/posix/s?profil.c use it, so we have to
fix the __profil_counter function accordingly, using sigcontextinfo.h's
sigcontext_get_pc.
|
|
SA_SIGINFO is actually just another way of expressing what we were
already passing over with struct sigcontext. This just introduces the
SIGINFO interface and fixes the posix values when that interface is
requested by the application.
|
|
x86 binaries are linked at 0x08000000, so we need to let them get mapped
there.
|
|
dl-sysdep has been wanting to use high bits in the vm_map mask for decades,
but that was only implemented lately.
|
|
When e.g. mmap is passed an invalid address we would return
KERN_INVALID_ADDRESS, while POSIX applications would expect EINVAL.
|
|
The __sin32 and __cos32 functions were only used in the now removed slow
path of asin and acos.
|
|
asin and acos have slow paths for rounding the last bit that cause some
calls to be 500-1500x slower than average calls.
These slow paths are rare, a test of a trillion (1.000.000.000.000)
random inputs between -1 and 1 showed 32870 slow calls for acos and 4473
for asin, with most occurrences between -1.0 .. -0.9 and 0.9 .. 1.0.
The slow paths claim correct rounding and use __sin32() and __cos32()
(which compare two result candidates and return the closest one) as the
final step, with the second result candidate (res1) having a small offset
applied from res. This suggests that res and res1 are intended to be 1
ULP apart (which makes sense for rounding), barring bugs, allowing us to
pick either one and still remain within 1 ULP of the exact result.
Remove the slow paths as the accuracy is better than 1 ULP even without
them, which is enough for glibc.
Also remove code comments claiming correctly rounded results.
After slow path removal, checking the accuracy of 14.400.000.000 random
asin() and acos() inputs showed only three incorrectly rounded
(error > 0.5 ULP) results:
- asin(-0x1.ee2b43286db75p-1) (0.500002 ULP, same as before)
- asin(-0x1.f692ba202abcp-4) (0.500003 ULP, same as before)
- asin(-0x1.9915e876fc062p-1) (0.50000000001 ULP, previously exact)
The first two had the same error even before this commit, and they did
not use the slow path at all.
Checking 4934 known randomly found previously-slow-path asin inputs
shows 25 calls with incorrectly rounded results, with a maximum error of
0.500000002 ULP (for 0x1.fcd5742999ab8p-1). The previous slow-path code
rounded all these inputs correctly (error < 0.5 ULP).
The observed average speed increase was 130x.
Checking 36240 known randomly found previously-slow-path acos inputs
shows 42 calls with incorrectly rounded results, with a maximum error of
0.500000008 ULP (for 0x1.f63845056f35ep-1). The previous "exact"
slow-path code showed 34 calls with incorrectly rounded results, with the
same maximum error of 0.500000008 ULP (for 0x1.f63845056f35ep-1).
The observed average speed increase was 130x.
The functions could likely be trimmed more while keeping acceptable
accuracy, but this at least gets rid of the egregiously slow cases.
Tested on x86_64.
|
|
The len variable is only used in the else branch.
We don't need the call to strlen if the name is 0 or 1 characters long.
2019-10-02 Lode Willems <Lode.Willems@UGent.be>
* tdlib/getenv.c: Move the call to strlen into the branch it's used.
|