Age | Commit message (Collapse) | Author | Files | Lines |
|
The iovec size should account for all substrings between each conversion
specification. For the format:
"abc %s efg"
The list of substrings are:
["abc ", arg, " efg]
which is 2 times the number of maximum arguments *plus* one.
This issue triggered 'out of bounds' errors by stdlib/tst-bz20544 when
glibc is built with experimental UBSAN support [1].
Besides adjusting the iovec size, a new runtime and check is added to
avoid wrong __libc_message_impl usage.
Checked on x86_64-linux-gnu.
[1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/ubsan-undef
Co-authored-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
|
|
During early startup memcpy or memset must not be called since many targets
use ifuncs for them which won't be initialized yet. Security hardening may
use -ftrivial-auto-var-init=zero which inserts calls to memset. Redirect
memset to memset_generic by including dl-symbol-redir-ifunc.h in cpu-features.c.
This fixes BZ #33112.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
The generic implementation is slight more optimized than the powerpc
one, where it has a more optimized inf/nan check (by not using FP
unit checks, along with branch prediction hints), and removed one
branch by issuing trunc instead of a combination of floor/ceil (which
also generated less code).
On power10 with gcc 14.2.1:
reciprocal-throughput master patch difference
workload-0_1 1.1351 0.9067 20.12%
workload-1_maxint 1.4230 0.9040 36.47%
workload-maxint_maxfloat 1.5038 0.9076 39.65%
workload-integral 1.1280 0.9111 19.23%
latency master patch difference
workload-0_1 1.1440 2.7117 -137.03%
workload-1_maxint 4.0556 2.7070 33.25%
workload-maxint_maxfloat 3.2122 2.7164 15.43%
workload-integral 3.2381 2.7281 15.75%
Checked on powerpc64le-linux-gnu.
Reviewed-by: Sachin Monga <smonga@linux.ibm.com>
|
|
The generic implementation is slight more optimized than the powerpc
one, where it has a more optimized inf/nan check (by not using FP
unit checks, along with branch prediction hints), and removed one
branch by issuing trunc instead of a combination of floor/ceil (which
also generated less code).
On power10 with gcc 14.2.1:
reciprocal-throughput master patch difference
workload-0_1 1.5210 1.3942 8.34%
workload-1_maxint 2.0926 1.3940 33.38%
workload-maxint_maxfloat 1.7851 1.3940 21.91%
workload-integral 1.5216 1.3941 8.37%
latency master patch difference
workload-0_1 1.5928 2.6337 -65.35%
workload-1_maxint 3.2929 2.6337 20.02%
workload-maxint_maxfloat 1.9697 2.6341 -33.73%
workload-integral 2.0597 2.6337 -27.87%
Checked on powerpc64le-linux-gnu.
Reviewed-by: Sachin Monga <smonga@linux.ibm.com>
|
|
Make '__close_nocancel_nostatus' standalone. This is a generic version
analogous to '__close_nocancel'. Platforms may choose to implement an
inline variant instead where the syscall invocation code sequence is
short enough to be beneficial over a function call.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
Fix fallout from commit c181840c93d3 ("Consolidate non cancellable close
call") that caused '__close_nocancel_nostatus' to clobber 'errno' on a
close(2) failure, a 2.27 regression.
The problem came from a rewrite from 'close_not_cancel_no_status' to
'__close_nocancel_nostatus' switching from an inline implementation that
used INTERNAL_SYSCALL macro (which stays away from 'errno') to a call to
'__close_nocancel' function that uses INLINE_SYSCALL_CALL macro (which
does poke at 'errno').
Implement '__close_nocancel_nostatus' in terms of INTERNAL_SYSCALL_CALL
then, which leaves 'errno' intact.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
Linux kernel >= 6.16 has getrandom() in vDSO for RISC-V. Enable the use
of it in Glibc so it would benefit the programs using the Glibc high
quality random number functions.
Link: https://git.kernel.org/torvalds/c/ee0d03053e70
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
They were misattributed as POWER9 sources.
|
|
When building on GNU/Hurd the following warnings repeat themselves:
../Rules:400: target '/home/collin/obj/glibc/io/test-lfs.out' given more than once in the same rule
../Rules:400: target '/home/collin/obj/glibc/io/test-lfs.out' given more than once in the same rule
This is because commit 73b854e955 (hurd: Mark more memory-hungry tests
as unsupported, 2025-01-12) added it to 'tests-unsupported' even though
it was already added by decf02d382 (hurd: Mark two tests as unsupported,
2023-04-13).
Message-ID: <54dc6bf7e0dbedb1b19356f41fec843c1c523b11.1750130025.git.collin.funk1@gmail.com>
|
|
When building on GNU/Hurd warnings like the following occur:
../sysdeps/x86_64/multiarch/strnlen-evex-base.S:53:10: warning: "P2ALIGN" redefined
53 | # define P2ALIGN(...) .p2align 4,, 6
| ^~~~~~~
In file included from /usr/include/x86_64-gnu/mach/x86_64/syscall_sw.h:30,
from ../sysdeps/mach/sysdep.h:21,
from ../sysdeps/mach/x86/sysdep.h:31,
from ../sysdeps/x86_64/multiarch/strnlen-evex-base.S:24:
/usr/include/x86_64-gnu/mach/x86_64/asm.h:78:9: note: this is the location of the previous definition
78 | #define P2ALIGN(p2) .p2align p2 /* gas-specific */
| ^~~~~~~
The fix is to undefine the macro from system headers in sysdep.h so that
it can be properly defined in assembly files where its definition
depends on whether string functions are being compiled for
wide-characters or not.
Message-ID: <721cd3a1bae1a553857db1dd69761a175f611364.1750131904.git.collin.funk1@gmail.com>
|
|
Update tst-gnu2-tls2 tests to set XMM0...XMM7 to all 1s in malloc to
verify that XMM registers are preserved when _dl_tlsdesc_dynamic is
called by clearing vectors with zeroed XMM registers before
_dl_tlsdesc_dynamic and using these XMM registers to clear vectors
after _dl_tlsdesc_dynamic. This improves the BZ #31372 test.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
|
|
Compiler generates the following instruction sequence for dynamic TLS
access:
leal tls_var@tlsgd(,%ebx,1), %eax
call ___tls_get_addr@PLT
CALL instruction is transparent to compiler which assumes all registers,
except for EFLAGS, AX, CX, and DX, are unchanged after CALL. But
___tls_get_addr is a normal function which doesn't preserve any vector
registers.
1. Rename the generic __tls_get_addr function to ___tls_get_addr_internal.
2. Change ___tls_get_addr to a wrapper function with implementations for
FNSAVE, FXSAVE, XSAVE and XSAVEC to save and restore all vector registers.
3. dl-tlsdesc-dynamic.h has:
_dl_tlsdesc_dynamic:
/* Like all TLS resolvers, preserve call-clobbered registers.
We need two scratch regs anyway. */
subl $32, %esp
cfi_adjust_cfa_offset (32)
It is wrong to use
movl %ebx, -28(%esp)
movl %esp, %ebx
cfi_def_cfa_register(%ebx)
...
mov %ebx, %esp
cfi_def_cfa_register(%esp)
movl -28(%esp), %ebx
to preserve EBX on stack. Fix it with:
movl %ebx, 28(%esp)
movl %esp, %ebx
cfi_def_cfa_register(%ebx)
...
mov %ebx, %esp
cfi_def_cfa_register(%esp)
movl 28(%esp), %ebx
4. Update _dl_tlsdesc_dynamic to call ___tls_get_addr_internal directly.
5. Add have-test-mtls-traditional to compile tst-tls23-mod.c with
traditional TLS variant to verify the fix.
6. Define DL_RUNTIME_RESOLVE_REALIGN_STACK in sysdeps/x86/sysdep.h.
This fixes BZ #32996.
Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Refactor the generic implementation to use math_config.h definitions,
and add an alternative one if the ABI supports truncf instructions
(gated through math-use-builtins-trunc.h).
The generic implementation generates similar code on x86_64, while
the optimization one for aarch64 (where truncf is supported as a
builtin by through frintz), the improvements are:
reciprocal-throughput master patch difference
workload-0_1 3.0595 3.0698 -0.34%
workload-1_maxint 5.1747 3.0542 40.98%
workload-maxint_maxfloat 3.4391 3.0349 11.75%
workload-integral 3.2732 3.0293 7.45%
latency master patch difference
workload-0_1 3.5267 4.7107 -33.57%
workload-1_maxint 6.9074 4.7282 31.55%
workload-maxint_maxfloat 3.7210 4.7506 -27.67%
workload-integral 3.8634 4.8137 -24.60%
Checked on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
Refactor the generic implementation to use math_config.h definitions,
and add an alternative one if the ABI supports truncf instructions
(gated through math-use-builtins-trunc.h).
The generic implementation generates similar code for x86_64, while
the optimization path aarch64 (where truncf is supported as a builtin)
through frintz), the improvements are:
reciprocal-throughput master patch difference
workload-0_1 3.0740 3.0326 1.35%
workload-1_maxint 5.2231 3.0436 41.73%
workload-maxint_maxfloat 4.0962 3.0551 25.42%
workload-integral 3.7093 3.0612 17.47%
latency master patch difference
workload-0_1 3.5521 4.7313 -33.20%
workload-1_maxint 6.7148 4.7314 29.54%
workload-maxint_maxfloat 4.0458 4.7518 -17.45%
workload-integral 3.9719 4.7427 -19.40%
Checked on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
Improve codegen by packing coefficients.
4% and 2% improvement in throughput microbenchmark on Neoverse V1, for acosh
and atanh respectively.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
Reworke SVE FP64 hyperbolics to use the SVE FEXPA
instruction.
Also update the special case handelling for large
inputs to be entirely vectorised.
Performance improvements on Neoverse V1:
cosh_sve: 19% for |x| < 709, 5x otherwise
sinh_sve: 24% for |x| < 709, 5.9x otherwise
tanh_sve: 12% for |x| < 19, 9x otherwise
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
Improve performance of SVE exps by making better use
of the SVE FEXPA instruction.
Performance improvement on Neoverse V1:
exp2_sve: 21%
exp2f_sve: 24%
exp10f_sve: 23%
expm1_sve: 25%
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|
|
Commit 404526ee2e58f3c075253943ddc9988f4bd6b80c changed _start to write
the last argument to __libc_start_main without taking into consideration
that the function did not create a full stack frame, which leads to
overwriting the argv[0].
|
|
Move Linux-specific termios headers and tests from misc to termios subdir
and install newly added bits/termios-cbaud.h.
|
|
There is no functional change in this patch.
We remove stores and loads to stack, return address signing, and redundant
CFI directives before and after call to __libc_arm_za_disable().
The __libc_arm_za_disable implementation follows special calling convention
that allows to avoid most of the operations that would be necessary for a
call to a normal function (see [1] for details).
First, we rely on __libc_arm_za_disable() not clobbering certain registers,
and we put return address into one of these registers. Now we don't need
to store it on stack, so we don't need to sign return address using PAC.
Second, as a result of the above, we don't need to update the CFI offset.
This patch provides small optimisation avoiding unnecessary store and load
on stack also simplifies assembly code and CFI directives.
[1]: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
No functional change here, just a small refactoring to simplify
using __alloc_gcs() for allocating shadow stacks.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Now that we require at least binutils 2.39 the support for POWER9 and
POWER10 instructions can be assumed.
|
|
netinet/tcp.h.
This patch adds the TCPI_OPT_USEC_TS constant from Linux 6.14 to
sysdeps/gnu/netinet/tcp.h
This patch adds the TCPI_OPT_TFO_CHILD constant from Linux 6.15 to
sysdeps/gnu/netinet/tcp.h
Signed-off-by: Jeremy Harris <jgh@exim.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Test that runs through a fairly large combination of the various
termios speed functions, for the new speed_t interface, for the old
speed_t interface (if enabled), and for the new baud_t interface.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
|
|
The generic code has __ispeed and __ospeed; Linux has c_ispeed and
c_ospeed. Use an anonymous union member to allow both set of names on
all platforms.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
|
|
Add an explicitly numeric interface for baudrate setting. For glibc,
this only announces what is a fair accompli, but this is a plausible
way forward for standardization, and may be possible to infill on
non-compliant systems. The POSIX committee has stated:
[https://www.austingroupbugs.net/view.php?id=1916#c7135]
A future version of this standard is expected to add at least
the following symbolic constants for use as values of objects
of type speed_t: B57600, B115200, B230400, B460800, and
B921600.
Implementations are encouraged to propose additional
interfaces which will make it possible to set and query a
wider range of speeds than just those enumerated by the
constants beginning with B. If a set of common interfaces
emerges between several implementations, a future version of
this standard will likely add those interfaces.
This is exactly that interface.
The use of the term "baud" is due to the need to have a term
contrasting "speed", and it is already well established as a legacy
term -- including in the names of the legacy Bxxx
constants. Futhermore, it *is* valid from the point of view that the
termios interface fundamentally emulates an RS-232 serial port as far
as the application software is concerned.
The documentation states that for the current version of glibc,
speed_t == baud_t, but explicitly declares that this may not be the
case in the future.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Now all platforms unconditionally use the "sane" definitions of the
termios baud constants. Unify them into a common file.
Note: I have made them explicitly unsigned to avoid problems with
compiler warnings for comparisons of unequal signedness or
similar. These constants were historically octal on most platforms,
and so unsigned by default.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Hurd with USE_OLD_TTY was the only remaining platform with speed_t not
containing a proper baud rate. From the looks of it, that code has
long since bitrotted.
Remove the vestiges of USE_OLD_TTY.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
|
|
Linux has supported arbitrary speeds and split speeds in the kernel
since 2008 on all platforms except Alpha (fixed in 2020), but glibc
was never updated to match. This is further complicated by POSIX uses
of macros for the cf[gs]et[io]speed interfaces, rather than plain
numbers, as it really ought to have.
On most platforms, the glibc ABI includes the c_[io]speed fields in
struct termios, but they are incorrectly used. On MIPS and SPARC, they
are entirely missing.
For backwards compatibility, the kernel will still use the legacy
speed fields unless they are set to BOTHER, and will use the legacy
output speed as the input speed if the latter is 0 (== B0). However,
the specific encoding used is visible to user space applications,
including ones other than the one running.
- SPARC and MIPS get a new struct termios, and tc[gs]etattr() is
versioned accordingly. However, the new struct termios is set to be
a strict extension of the old one, which means that cf* interfaces
other than the speed-related ones do not need versioning.
- The Bxxx constants are redefined as equivalent to their integer
values and the legacy Bxxx constants are renamed __Bxxx.
- cf[gs]et[io]speed() and cfsetspeed() are versioned accordingly.
- tcgetattr() and cfset[io]speed() are adjusted to always keep the
c_[io]speed fields correct (unlike earlier versions), but to
canonicalize the representation to ALSO configure the legacy fields
if a valid legacy representation exists.
- tcsetattr(), too, canonicalizes the representation in this way
before passing it to the kernel, to maximize compatibility with
older applications/tools.
- The old IBAUD0 hack is removed; it is no longer necessary since
even the legacy c_cflag baud rate fields have had separate input
values for a long time.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
The powerpc architecture, only, emulates the termios ioctls using the
glibc termios structure. Export the real kernel ones as the termios2
interface; although the kernel doesn't call it termios2, it is exactly
the termios2 interface, and it avoids the namespace clash between the
emulated ioctls and the real kernel ioctls.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
In the kernel, these are <linux/sockios.h>. The differences between
<linux/sockios.h> and the copied data in <bits/ioctls.h> are minor;
mainly some #ifdefs, so try to use <linux/sockios.h> directly; it is
hopefully clean enough these days to use directly.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Replace local_isatty() inlined in libio with a proper function
__isatty_nostatus(). This allows simpler system-specific
implementations that don't need to touch errno at all.
Note: I left the prototype in include/unistd.h (the internal header
file.) It didn't much make sense to me to put it in a different header
(not-cancel.h), but perhaps someone can elucidate the need.
Add such an implementation for Linux, with a generic fallback.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
There is a prototype for an internal __tcsetattr() function in
include/termios.h, but tcsetattr without __ were still declared as the
actual functions.
Make this match the comment and make __tcsetattr() an internal
interface. This will be required to version struct termios for Linux on
MIPS and SPARC.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
This reverts commit 3367d8e180848030d1646f088759f02b8dfe0d6f
Reason for revert: Power10 strcmp clobbers non-volatile vector
registers (Bug 33056)
Tested on ppc64le without regression.
|
|
This reverts commit b9182c793caa05df5d697427c0538936e6396d4b
Reason for revert: Power10 memchr clobbers v20 vector register
(Bug 33059)
This is not a security issue, unlike CVE-2025-5745 and
CVE-2025-5702.
Tested on ppc64le without regression.
|
|
(CVE-2025-5702)
This reverts commit 90bcc8721ef82b7378d2b080141228660e862d56
This change is in the chain of the final revert that fixes the CVE
i.e. 3367d8e180848030d1646f088759f02b8dfe0d6f
Reason for revert: Power10 strcmp clobbers non-volatile vector
registers (Bug 33056)
Tested on ppc64le with no regressions.
|
|
This reverts commit 23f0d81608d0ca6379894ef81670cf30af7fd081
Reason for revert: Power10 strncmp clobbers non-volatile vector
registers (Bug 33060)
Tested on ppc64le with no regressions.
|
|
rtld.c has
extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
...
_dl_rtld_map.l_map_start = (ElfW(Addr)) &__ehdr_start;
_dl_rtld_map.l_map_end = (ElfW(Addr)) _end;
As
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120653
shows, compiler may generate run-time relocation on __ehdr_start with
movq .LC0(%rip), %xmm0
...
.section .data.rel.ro.local,"aw"
.align 8
.LC0:
.quad __ehdr_start
This won't work before run-time relocation is finished in rtld.c. Add
optimization barrier to prevent run-time relocations against __ehdr_start
and _end.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
|
|
Signed-off-by: gfleury <gfleury@disroot.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-ID: <20250613184440.1660335-1-gfleury@disroot.org>
|
|
The third argument to __riscv_hwprobe is the size in bytes of the
cpu bitmask pointed to by the fourth argument, however in the access
attribute (read_only, 4, 3) it is used as an element count (i.e., the
number of unsigned longs that make up the bitmask), resulting in a
false compiler warning:
$ gcc -c hwprobe1.c
hwprobe1.c: In function 'main':
hwprobe1.c:15:11: warning: '__riscv_hwprobe' reading 1024 bytes from a region of size 128 [-Wstringop-overread]
15 | ret = __riscv_hwprobe (pairs, 1, sizeof(cpus), cpus, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hwprobe1.c:9:23: note: source object 'cpus' of size 128
9 | unsigned long int cpus[16];
| ^~~~
In file included from hwprobe1.c:1:
/usr/include/riscv64-linux-gnu/sys/hwprobe.h:66:12: note: in a call to function '__riscv_hwprobe' declared with attribute 'access (read_only, 4, 3)'
66 | extern int __riscv_hwprobe (struct riscv_hwprobe *__pairs, size_t __pair_count,
| ^~~~~~~~~~~~~~~
$
The documentation (https://docs.kernel.org/arch/riscv/hwprobe.html)
claims that the cpu bitmask has the type cpu_set_t *, which would be
consistent with other functions that take a cpu bitmask such as
sched_setaffinity and sched_getaffinity. It also uses the name
cpusetsize for the third argument, which is much more accurate than
cpu_count since it is a size in bytes and not a cpu count. The
(read_only, 4, 3) access attribute in the glibc prototype claims
that the cpu bitmask is only read, however when flags is
RISCV_HWPROBE_WHICH_CPUS it is both read and written.
Therefore, in the glibc prototype the type of the fourth argument is
changed to cpu_set_t * to match the documentation, the name of the
third argument is changed to cpusetsize as in the documentation, and the
incorrect access attribute that applies to these arguments is removed.
Almost all existing callers pass a null pointer for the fourth
argument, however a transparent union is introduced for compatibility
with callers that cast a pointer to the old argument type, and a
macro is introduced allowing callers the ability to distinguish
between the old and new prototype when needed.
The access attributes are being specified with __fortified_attr_access,
however this macro is for fortified functions; the regular
__attr_access macro is for non-fortified functions such as this one.
Using the incorrect macro results in no access checks at fortify level
3, because it is assumed that the fortified function will be doing the
checking. It is changed to use the correct macro so that the access
checks will work regardless of fortify level.
Also because __riscv_hwprobe is not a cancellation point, __THROW
is added, consistent with similar functions. (However, it is omitted
from the typedef because GCC does not accept it there.)
The __wur (warn_unused_result) attribute is helpful for functions that
cannot be used safely without checking the result, however code such
as the following does not require the result to be checked and should
not produce a warning:
struct riscv_hwprobe pair = { RISCV_HWPROBE_KEY_IMA_EXT_0, 0 };
__riscv_hwprobe (&pair, 1, 0, NULL, 0);
if (pair.value & RISCV_HWPROBE_EXT_ZBB) ...
Therefore this attribute is omitted.
The comment claiming that the second argument to the ifunc selector
is a pointer to the vDSO function is corrected. It is a pointer to
the regular glibc function (which returns errors as positive values),
not the vDSO function (which returns errors as negative values).
Fixes commit 426d0e1aa8f17426d13707594111df712d2b8911 ("riscv: Add
Linux hwprobe syscall support").
Fixes: BZ #32932
Signed-off-by: Mark Harris <mark.hsj@gmail.com>
Signed-off-by: Mark Harris <mark.hsj@gmail.com>
Reviewed-by: Palmer Dabbelt <palmer@dabbelt.com>
Acked-by: Palmer Dabbelt <palmer@dabbelt.com>
|
|
|
|
25d37948c9f3 ("malloc: Improve malloc initialization") moved calling malloc
initialization earlier, within _dl_sysdep_start's call to dl_main, before
__mach_init is called by _dl_init_first. But malloc initialization uses
getrandom, which needs to make RPCs.
This adds __getrandom_early_init on hurd to express that getrandom needs
__mach_init too. This also adds a guard to avoid making it create several task
and host ports.
Fixes: 25d37948c9f3 ("malloc: Improve malloc initialization")
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
In init_cpu_features, replace GLRO(dl_x86_cpu_features) with
cpu_features to avoid an extra load.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
Early versions of GCC 12 didn't support -mtune=neoverse-v2, so use
-mtune=neoverse-v1 instead.
Reported-by: Yury Khrustalev <yury.khrustalev@arm.com>
|
|
Support for the SIOCGIFINDEX ioctl(2) Linux ABI (0x8933 command, called
SIOGIFINDEX in the API originally) was added with kernel version 2.1.14
for AF_INET6 sockets, followed by general support with version 2.1.22.
The Linux API was then updated by adding the current SIOCGIFINDEX name
with kernel version 2.1.68, back in Nov 1997.
All these kernel versions are well below our current default required
minimum of 3.2.0, let alone some platform higher version requirements.
Drop support for the absence of the SIOCGIFINDEX ioctl(2) in the API or
ABI, by removing arrangements for the ENOSYS error condition. Discard
the indirection from '__if_nameindex' to 'if_nameindex_netlink' and
adjust the implementation of '__if_nametoindex' accordingly for a better
code flow.
|
|
Add a new helper function __ifunc_hwcap() as a portable way to
access HWCAP elements via the parameter(s) passed to an ifunc
resolver checking the _IFUNC_ARG_HWCAP bit in the first parameter
and size of the buffer in the second parameter.
Note that 0 is returned when the requested element is not available
or does not correspond to a valid AT_HWCAP{,2,...} value.
Also add relevant tests.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
Add basic support for hwcap3 and hwcap4 in dynamic loader and
ifunc resolvers.
Describe new backward-compatible prototype for GNU indirect
function resolvers that use a pointer to uint64_t array in
stead of a pointer to the __ifunc_arg_t struct.
This patch also adds macro _IFUNC_HWCAP_MAX to specify current
number of hwcap elements.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
|
|
sparc start.S does not provide the final argument for
__libc_start_main, which is the highest stack address used to
update the __libc_stack_end.A
This fixes elf/tst-execstack-prog-static-tunable on sparc64.
On sparcv9 this does not happen because the kernel puts an
auxv value, which turns to point to a value in the stack itself.
Checked on sparc64-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
Before:
rt_sigaction(SIGBUS, {sa_handler=0x55abb9960139, sa_mask=[], sa_flags=SA_RESTORER|SA_RESETHAND|SA_SIGINFO|0xffffffff00000000, sa_restorer=0x7fb1b2a82050}, NULL, 8) = 0
After:
rt_sigaction(SIGBUS, {sa_handler=0x7f6a70dce139, sa_mask=[], sa_flags=SA_RESTORER|SA_RESETHAND|SA_SIGINFO, sa_restorer=0x7f6a70c28f60}, NULL, 8) = 0
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
|
|
The new float and double implementation does not required an
extra function call and error handling uses math_err function,
which results in better performance on i386 as well.
With gcc-14 on AMD AMD Ryzen 9 5900X, master shows:
$ ./benchtests/bench-ilogb
"ilogb": {
"subnormal": {
"duration": 3.68863e+09,
"iterations": 1.72228e+08,
"max": 89.2995,
"min": 21.016,
"mean": 21.4171
},
"normal": {
"duration": 3.68878e+09,
"iterations": 1.72948e+08,
"max": 78.6065,
"min": 21.127,
"mean": 21.3288
}
}
$ ./benchtests/bench-ilogbf
"ilogbf": {
"subnormal": {
"duration": 3.68835e+09,
"iterations": 1.66716e+08,
"max": 46.953,
"min": 21.793,
"mean": 22.1236
},
"normal": {
"duration": 3.68784e+09,
"iterations": 1.66168e+08,
"max": 46.9715,
"min": 21.904,
"mean": 22.1935
}
}
While with this patch:
$ ./benchtests/bench-ilogb
"ilogb": {
"subnormal": {
"duration": 3.68134e+09,
"iterations": 4.17516e+08,
"max": 32.5045,
"min": 8.3245,
"mean": 8.81723
},
"normal": {
"duration": 3.6677e+09,
"iterations": 6.79468e+08,
"max": 50.9305,
"min": 5.3465,
"mean": 5.3979
}
}
$ ./benchtests/bench-ilogbf
"ilogbf": {
"subnormal": {
"duration": 3.67553e+09,
"iterations": 5.11032e+08,
"max": 35.927,
"min": 7.0485,
"mean": 7.19237
},
"normal": {
"duration": 3.66877e+09,
"iterations": 6.556e+08,
"max": 26.3625,
"min": 5.5315,
"mean": 5.59605
}
}
Checked on i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
|