riscv-gnu-toolchain/glibc.git

Age	Commit message	Author	Files	Lines
2024-05-07	elf: Only process multiple tunable once (BZ 31686)	Adhemerval Zanella	1	-0/+4
	The 680c597e9c3 commit made loader reject ill-formatted strings by first tracking all set tunables and then applying them. However, it does not take into consideration if the same tunable is set multiple times, where parse_tunables_string appends the found tunable without checking if it was already in the list. It leads to a stack-based buffer overflow if the tunable is specified more than the total number of tunables. For instance: GLIBC_TUNABLES=glibc.malloc.check=2:... (repeat over the number of total support for different tunable). Instead, use the index of the tunable list to get the expected tunable entry. Since now the initial list is zero-initialized, the compiler might emit an extra memset and this requires some minor adjustment on some ports. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reported-by: Yuto Maeda <maeda@cyberdefense.jp> Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org> (cherry picked from commit bcae44ea8536b30a7119c0986ff5692bddacb672)
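The difference between appending each parsed tunable and storing it at its own index can be sketched as follows. This is a minimal illustration, not glibc's actual code: the table size, `struct tunable_entry`, `tunable_index` and `record_tunable` are hypothetical names, and the "last value wins" behavior is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tunable table; glibc generates its real list at build time.  */
enum { TUNABLE_COUNT = 3 };
static const char *const tunable_names[TUNABLE_COUNT] = {
  "glibc.malloc.check", "glibc.malloc.perturb", "glibc.malloc.tcache_max"
};

/* One slot per known tunable (hypothetical layout).  */
struct tunable_entry {
  int is_set;
  long value;
};

/* Return the index of NAME in the table, or -1 if it is unknown.  */
static int tunable_index(const char *name)
{
  for (int i = 0; i < TUNABLE_COUNT; i++)
    if (strcmp(tunable_names[i], name) == 0)
      return i;
  return -1;
}

/* Record NAME=VALUE in LIST, which has exactly TUNABLE_COUNT slots.
   A repeated name overwrites its own slot instead of consuming a new
   one, so no input string can write past the end of LIST.  */
static int record_tunable(struct tunable_entry list[TUNABLE_COUNT],
                          const char *name, long value)
{
  int idx = tunable_index(name);
  if (idx < 0)
    return -1;            /* Unknown tunable: ignore it.  */
  list[idx].is_set = 1;
  list[idx].value = value;
  return idx;
}
```

With an appending scheme, repeating `glibc.malloc.check=2` more times than there are tunables would run off the end of a fixed-size list; with indexing, the hundredth repetition lands in the same slot as the first.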
2024-04-09	AArch64: Check kernel version for SVE ifuncs	Wilco Dijkstra	4	-2/+5
	Old Linux kernels disable SVE after every system call. Calling the SVE-optimized memcpy afterwards will then cause a trap to reenable SVE. As a result, applications with a high use of syscalls may run slower with the SVE memcpy. This is true for kernels between 4.15.0 and before 6.2.0, except for 5.14.0 which was patched. Avoid this by checking the kernel version and selecting the SVE ifunc on modern kernels. Parse the kernel version reported by uname() into a 24-bit kernel.major.minor value without calling any library functions. If uname() is not supported or if the version format is not recognized, assume the kernel is modern. Tested-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> (cherry picked from commit 2e94e2f5d2bf2de124c8ad7da85463355e54ccb2)
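A rough sketch of packing a release string into a 24-bit value without library calls is shown below. It is not the code from the commit: `parse_kernel_version`, `prefer_sve_ifuncs`, the clamping of each field to 255 and the use of `UINT32_MAX` as the "assume modern" marker are all assumptions made for illustration. The version ranges come from the commit message above.

```c
#include <stdint.h>

/* Pack a release string such as "5.14.0-rc1" into 0xKKMMmm
   (kernel, major, minor: 8 bits each).  Returns UINT32_MAX if the
   format is not recognized, which callers treat as "modern".
   Uses no library functions, only character arithmetic.  */
static uint32_t parse_kernel_version(const char *release)
{
  uint32_t version = 0;
  for (int field = 0; field < 3; field++)
    {
      uint32_t n = 0;
      int digits = 0;
      while (*release >= '0' && *release <= '9')
        {
          n = n * 10 + (uint32_t) (*release - '0');
          if (n > 255)
            n = 255;              /* Saturate so each field fits in 8 bits.  */
          release++;
          digits++;
        }
      if (digits == 0)
        return UINT32_MAX;        /* Field missing or not numeric.  */
      version = (version << 8) | n;
      if (field < 2)
        {
          if (*release != '.')
            return UINT32_MAX;    /* Expected a '.' separator.  */
          release++;
        }
    }
  return version;
}

/* Per the commit message: kernels from 4.15.0 up to (not including)
   6.2.0 are affected, except 5.14.0, which was patched.  */
static int prefer_sve_ifuncs(uint32_t version)
{
  if (version == 0x050e00)
    return 1;
  return version < 0x040f00 || version >= 0x060200;
}
```

An unrecognized string yields `UINT32_MAX`, which is above `0x060200`, so the "assume the kernel is modern" fallback needs no extra branch in this sketch.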
2024-04-09	aarch64: fix check for SVE support in assembler	Szabolcs Nagy	2	-4/+6
	Due to GCC bug 110901, -mcpu can override the -march setting when compiling asm code, so a compiler targeting a specific CPU can fail the configure check even when binutils gas supports SVE. The workaround is that an explicit .arch directive overrides both -mcpu and -march, and since that's what the actual SVE memcpy uses, the configure check should use it too, even if the GCC issue is fixed independently. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit 73c26018ed0ecd9c807bb363cc2c2ab4aca66a82)
2024-04-09	aarch64/fpu: Sync libmvec routines from 2.39 and before with AOR	Joe Ramsay	18	-105/+111
	This includes a fix for big-endian in AdvSIMD log, some cosmetic changes, and numerous small optimisations mainly around inlining and using indexed variants of MLA intrinsics. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit e302e1021391d13a9611ba3a910df128830bd19e)
2024-04-01	elf: Enable TLS descriptor tests on aarch64	Adhemerval Zanella	1	-0/+1
	The aarch64 uses 'trad' for traditional tls and 'desc' for tls descriptors, but unlike other targets it defaults to 'desc'. The gnutls2 configure check does not set aarch64 as an ABI that uses TLS descriptors, which then disables some tests. Also rename the internal machinery from gnu2 to tls descriptors. Checked on aarch64-linux-gnu. Reviewed-by: H.J. Lu <hjl.tools@gmail.com> (cherry picked from commit 3d53d18fc71c5d9ef4773b8bce04d54b80181926)
2024-01-04	aarch64: Make cpu-features definitions not Linux-specific	Sergey Bugaev	2	-0/+110
	These describe generic AArch64 CPU features, and are not tied to a kernel-specific way of determining them. We can share them between the Linux and Hurd AArch64 ports. Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240103171502.1358371-13-bugaevc@gmail.com>
2024-01-02	aarch64: Add longjmp test for SME	Szabolcs Nagy	2	-0/+283
	Includes test for setcontext too. The test directly checks after longjmp if ZA got disabled and the ZA contents got saved following the lazy saving scheme. It does not use ACLE code to verify that gcc can interoperate with glibc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-01-02	aarch64: Add longjmp support for SME	Szabolcs Nagy	1	-0/+22
	For the ZA lazy saving scheme to work, longjmp has to call __libc_arm_za_disable. In ld.so we assume ZA is not used so longjmp does not need special support there. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-01-02	aarch64: Add SME runtime support	Szabolcs Nagy	3	-3/+129
	The runtime support routines for the call ABI of the Scalable Matrix Extension (SME) are mostly in libgcc. Since libc.so cannot depend on libgcc_s.so have an implementation of __arm_za_disable in libc for libc internal use in longjmp and similar APIs. __libc_arm_za_disable follows the same PCS rules as __arm_za_disable, but it's a hidden symbol so it does not need variant PCS marking. Using __libc_fatal instead of abort because it can print a message and works in ld.so too. But for now we don't need SME routines in ld.so. To check the SME HWCAP in asm, we need the _dl_hwcap2 member offset in _rtld_global_ro in the shared libc.so, while in libc.a the _dl_hwcap2 object is accessed. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-01-01	Update copyright dates with scripts/update-copyrights	Paul Eggert	223	-223/+223

2023-12-20	aarch64: Add SIMD attributes to math functions with vector versions	Joe Ramsay	2	-0/+113
	Added annotations for autovec by GCC and GFortran - this enables GCC >= 9 to autovectorise math calls at -Ofast. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-12-20	aarch64: Add half-width versions of AdvSIMD f32 libmvec routines	Joe Ramsay	18	-14/+108
	Compilers may emit calls to 'half-width' routines (two-lane single-precision variants). These have been added in the form of wrappers around the full-width versions, where the low half of the vector is simply duplicated. This will perform poorly when one lane triggers the special-case handler, as there will be a redundant call to the scalar version, however this is expected to be rare at Ofast. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
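The wrapper idea can be modelled in portable C, with plain structs standing in for the vector types. Everything here is hypothetical: `f32x4`, `f32x2` and the squaring routine are stand-ins chosen so the sketch compiles anywhere, not the AdvSIMD types or the libmvec routines themselves.

```c
/* Plain structs standing in for 4-lane and 2-lane float vectors.  */
typedef struct { float lane[4]; } f32x4;
typedef struct { float lane[2]; } f32x2;

/* Stand-in for a full-width routine: squares every lane.  */
static f32x4 full_width_square(f32x4 x)
{
  f32x4 r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = x.lane[i] * x.lane[i];
  return r;
}

/* Half-width wrapper: duplicate the two input lanes into the upper
   half, call the full-width routine, and keep only the low half.
   The upper two lanes are computed redundantly and discarded.  */
static f32x2 half_width_square(f32x2 x)
{
  f32x4 wide = { { x.lane[0], x.lane[1], x.lane[0], x.lane[1] } };
  f32x4 r = full_width_square(wide);
  f32x2 out = { { r.lane[0], r.lane[1] } };
  return out;
}
```

Because the upper lanes mirror the lower ones, an input that trips a special case does so in two lanes at once, which is the redundant scalar call the commit message mentions.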
2023-12-05	aarch64: correct CFI in rawmemchr (bug 31113)	Andreas Schwab	1	-1/+1
	The .cfi_return_column directive changes the return column for the whole FDE range. But the actual intent is to tell the unwinder that the value in x30 (lr) now resides in x15 after the move, and that is expressed by the .cfi_register directive.
2023-12-04	aarch64: fix tested ifunc variants	Szabolcs Nagy	1	-3/+3
	Don't test a64fx string functions when BTI is enabled since they are not BTI compatible.
2023-11-29	aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines	Joe Ramsay	1	-1/+7
	Avoids emitting many saves/restores of vector registers, reduces the amount of code generated around the scalar fallback.
2023-11-22	aarch64: Fix libmvec benchmarks	Joe Ramsay	2	-49/+81
	These were broken by the new atan2 functions, as they were only set up for univariate functions. Arity is now detected from the input file - this revealed a mistake that the double-precision inputs were being used for both single- and double-precision routines, which is now remedied.
2023-11-21	elf: Remove LD_PROFILE for static binaries	Adhemerval Zanella	2	-2/+4
	The _dl_non_dynamic_init does not parse LD_PROFILE, which does not enable profile for dlopen objects. Since dlopen is deprecated for static objects, it is better to remove the support. It also allows to trim down libc.a of profile support. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-11-20	aarch64: Add vector implementations of expm1 routines	Joe Ramsay	12	-0/+458
	May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
2023-11-13	AArch64: Remove Falkor memcpy	Wilco Dijkstra	5	-324/+0
	The latest implementations of memcpy are actually faster than the Falkor implementations [1], so remove the falkor/phecda ifuncs for memcpy and the now unused IS_FALKOR/IS_PHECDA defines. [1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-11-13	AArch64: Add memset_zva64	Wilco Dijkstra	6	-68/+38
	Add a specialized memset for the common ZVA size of 64 to avoid the overhead of reading the ZVA size. Since the code is identical to __memset_falkor, remove the latter. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-11-13	AArch64: Cleanup emag memset	Wilco Dijkstra	4	-197/+90
	Cleanup emag memset - merge the memset_base64.S file, remove the unused ZVA code (since it is disabled on emag). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2023-11-10	aarch64: Add vector implementations of log1p routines	Joe Ramsay	12	-0/+496
	May discard sign of zero.
2023-11-10	aarch64: Add vector implementations of atan2 routines	Joe Ramsay	14	-0/+531

2023-11-10	aarch64: Add vector implementations of atan routines	Joe Ramsay	12	-0/+403

2023-11-10	aarch64: Add vector implementations of acos routines	Joe Ramsay	12	-1/+436

2023-11-10	aarch64: Add vector implementations of asin routines	Joe Ramsay	12	-1/+403

2023-11-01	AArch64: Cleanup ifuncs	Wilco Dijkstra	18	-125/+41
	Cleanup ifuncs. Remove uses of libc_hidden_builtin_def, use ENTRY rather than ENTRY_ALIGN, remove unnecessary defines and conditional compilation. Rename strlen_mte to strlen_generic. Remove rtld-memset. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-10-24	AArch64: Add support for MOPS memcpy/memmove/memset	Wilco Dijkstra	9	-1/+137
	Add support for MOPS in cpu_features and INIT_ARCH. Add ifuncs using MOPS for memcpy, memmove and memset (use .inst for now so it works with all binutils versions without needing complex configure and conditional compilation). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-10-23	aarch64: Add vector implementations of exp10 routines	Joe Ramsay	12	-0/+524
	Double-precision routines either reuse the exp table (AdvSIMD) or use SVE FEXPA instruction.
2023-10-23	aarch64: Add vector implementations of log10 routines	Joe Ramsay	14	-1/+580
	A table is also added, which is shared between AdvSIMD and SVE log10.
2023-10-23	aarch64: Add vector implementations of log2 routines	Joe Ramsay	14	-1/+545
	A table is also added, which is shared between AdvSIMD and SVE log2.
2023-10-23	aarch64: Add vector implementations of exp2 routines	Joe Ramsay	12	-0/+459
	Some routines reuse table from v_exp_data.c
2023-10-23	aarch64: Add vector implementations of tan routines	Joe Ramsay	18	-1/+1244
	This includes some utility headers for evaluating polynomials using various schemes.
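The commit message does not name the schemes. As an illustration only, here are two common ones, Horner and Estrin, in scalar C for a degree-3 polynomial. The function names are hypothetical, and the assumption that these are among the schemes in the headers is mine.

```c
/* Horner's scheme: c0 + x*(c1 + x*(c2 + x*c3)).
   Fewest operations, but every step depends on the previous one.  */
static double horner3(double x, const double c[4])
{
  return c[0] + x * (c[1] + x * (c[2] + x * c[3]));
}

/* Estrin's scheme: (c0 + c1*x) + x^2 * (c2 + c3*x).
   The two inner terms are independent, so they can be evaluated in
   parallel at the cost of computing x^2 separately.  */
static double estrin3(double x, const double c[4])
{
  double x2 = x * x;
  return (c[0] + c[1] * x) + x2 * (c[2] + c[3] * x);
}
```

Both compute the same polynomial. The results can differ in the last bits for general inputs because the rounding order differs, which is one reason a library might offer more than one scheme.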
2023-10-05	aarch64: Optimise vecmath logs	Joe Ramsay	7	-215/+226
	* Transpose table layout for improved memory access * Use half-vector special comparisons for AdvSIMD * Improve register use near special-case branches - Due to the presence of a function call, return value would get mov-d out of x0 in order to facilitate PCS. By moving the final computation after the branch this can be avoided Also change SVE routines to use overloaded intrinsics for readability.
2023-10-05	aarch64: Cosmetic change in SVE exp routines	Joe Ramsay	2	-47/+44
	Use overloaded intrinsics for readability. Codegen does not change, however while we're bringing the routines up-to-date with recent improvements to other routines in AOR it is worth copying this change over as well.
2023-10-05	aarch64: Optimize SVE cos & cosf	Joe Ramsay	2	-53/+47
	Saves a mov by ensuring return value does not need to be moved out of the way before special-case branch. Also change to use overloaded intrinsics.
2023-10-05	aarch64: Improve vecmath sin routines	Joe Ramsay	3	-73/+87
	* Update ULP comment reflecting a new observed max in [-pi/2, pi/2] * Use the same polynomial in AdvSIMD and SVE, rather than FTRIG instructions * Improve register use near special-case branch Also use overloaded intrinsics for SVE.
2023-09-26	AArch64: Remove -0.0 check from vector sin	Wilco Dijkstra	2	-12/+2
	Remove the unnecessary extra checks for sin (-0.0) from vector sin/sinf, improving performance. Passes regress. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-07-17	configure: Use autoconf 2.71	Siddhesh Poyarekar	1	-91/+111
	Bump autoconf requirement to 2.71 to allow regenerating configure on more recent distributions. autoconf 2.71 has been in Fedora since F36 and is the current version in Debian stable (bookworm). It appears to be current in Gentoo as well. All sysdeps configure and preconfigure scripts have also been regenerated; all changes are trivial transformations that do not affect functionality. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2023-06-30	aarch64: Add vector implementations of exp routines	Joe Ramsay	14	-1/+593
	Optimised implementations for single and double precision, Advanced SIMD and SVE, copied from Arm Optimized Routines. As previously, data tables are used via a barrier to prevent overly aggressive constant inlining. Special-case handlers are marked NOINLINE to avoid incurring the penalty of switching call standards unnecessarily. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30	aarch64: Add vector implementations of log routines	Joe Ramsay	14	-1/+559
	Optimised implementations for single and double precision, Advanced SIMD and SVE, copied from Arm Optimized Routines. Log lookup table added as HIDDEN symbol to allow it to be shared between AdvSIMD and SVE variants. As previously, data tables are used via a barrier to prevent overly aggressive constant inlining. Special-case handlers are marked NOINLINE to avoid incurring the penalty of switching call standards unnecessarily. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30	aarch64: Add vector implementations of sin routines	Joe Ramsay	12	-6/+426
	Optimised implementations for single and double precision, Advanced SIMD and SVE, copied from Arm Optimized Routines. As previously, data tables are used via a barrier to prevent overly aggressive constant inlining. Special-case handlers are marked NOINLINE to avoid incurring the penalty of switching call standards unnecessarily. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30	aarch64: Add vector implementations of cos routines	Joe Ramsay	10	-117/+608
	Replace the loop-over-scalar placeholder routines with optimised implementations from Arm Optimized Routines (AOR). Also add some headers containing utilities for aarch64 libmvec routines, and update libm-test-ulps. Data tables for new routines are used via a pointer with a barrier on it, in order to prevent overly aggressive constant inlining in GCC. This allows a single adrp, combined with offset loads, to be used for every constant in the table. Special-case handlers are marked NOINLINE in order to confine the save/restore overhead of switching from vector to normal calling standard. This way we only incur the extra memory access in the exceptional cases. NOINLINE definitions have been moved to math_private.h in order to reduce duplication. AOR exposes a config option, WANT_SIMD_EXCEPT, to enable selective masking (and later fixing up) of invalid lanes, in order to trigger fp exceptions correctly (AdvSIMD only). This is tested and maintained in AOR, however it is configured off at source level here for performance reasons. We keep the WANT_SIMD_EXCEPT blocks in routine sources to greatly simplify the upstreaming process from AOR to glibc. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
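The barrier and NOINLINE ideas described above might look roughly like this in scalar C. It is a sketch under assumptions, not the glibc source: the macro body, `example_table`, `special_case` and `eval` are illustrative, and the code relies on GNU C extensions (statement expressions and inline asm), so it needs GCC or Clang.

```c
/* Keep the special-case path out of line so its call overhead is
   confined to the rare cases that reach it.  */
#define NOINLINE __attribute__ ((noinline))

/* Pass a pointer through an empty asm statement.  The compiler can no
   longer see which object it points to, so it loads table entries
   relative to one base address instead of folding each constant into
   the code separately.  */
#define ptr_barrier(ptr)                          \
  ({                                              \
    __typeof__ (ptr) p_ = (ptr);                  \
    __asm__ ("" : "+r" (p_));                     \
    p_;                                           \
  })

/* Hypothetical coefficient table.  */
static const struct example_data { double c0, c1, c2; } example_table
  = { 1.0, 0.5, 0.25 };

/* Rare path: a NaN input is passed through unchanged.  */
static NOINLINE double special_case(double x)
{
  return x;
}

/* Evaluate c0 + c1*x + c2*x^2, reading the table through the barrier.  */
static double eval(double x)
{
  const struct example_data *d = ptr_barrier (&example_table);
  if (x != x)                     /* True only for NaN.  */
    return special_case (x);
  return d->c0 + x * (d->c1 + x * d->c2);
}
```

The barrier does not change the computed result. It only influences how the compiler materializes the table address and the constants loaded from it.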
2023-06-02	Fix a few more typos I missed in previous round -- BZ 25337	Paul Pluzhnikov	1	-1/+1

2023-05-30	Fix misspellings in sysdeps/ -- BZ 25337	Paul Pluzhnikov	2	-5/+5

2023-05-05	aarch64: More configure checks for libmvec	Szabolcs Nagy	2	-6/+48
	Check assembler and linker support too, not just SVE ACLE in the compiler, since variant PCS requires at least binutils 2.32.1.
2023-05-05	aarch64: SVE ACLE configure test cleanups	Szabolcs Nagy	2	-16/+27
	Use more idiomatic configure test for better autoconf cache and logs.
2023-05-04	aarch64: fix SVE ACLE check for bootstrap glibc builds	Szabolcs Nagy	2	-2/+2
	arm_sve.h depends on stdint.h but that relies on libc headers unless compiled in freestanding mode. Without this change a bootstrap glibc build (that uses a compiler without installed libc headers) failed with: checking for availability of SVE ACLE... In file included from [...]/arm_sve.h:28, from conftest.c:1: [...]/stdint.h:9:16: fatal error: stdint.h: No such file or directory 9 | # include_next <stdint.h> | ^~~~~~~~~~ compilation terminated. configure: error: mathvec is enabled but compiler does not have SVE ACLE. [...]
2023-05-03	Enable libmvec support for AArch64	Joe Ramsay	25	-0/+910
	This patch enables libmvec on AArch64. The proposed change is mainly implementing build infrastructure to add the new routines to ABI, tests and benchmarks. I have demonstrated how this all fits together by adding implementations for vector cos, in both single and double precision, targeting both Advanced SIMD and SVE. The implementations of the routines themselves are just loops over the scalar routine from libm for now, as we are more concerned with getting the plumbing right at this point. We plan to contribute vector routines from the Arm Optimized Routines repo that are compliant with requirements described in the libmvec wiki. Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising the minimum GCC by such a big jump, we allow users to disable libmvec if their compiler is too old. Note that at this point users have to manually call the vector math functions. This seems to be acceptable to some downstream users. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-02-24	aarch64: update libm test ulps	Szabolcs Nagy	1	-0/+1