path: root/newlib/libc/machine
Age | Commit message | Author | Files | Lines
2025-07-25 | newlib: libc: return back support for AArch64 ILP32 | Radek Bartoň | 14 | -10/+64
This patch restores support for the AArch64 ILP32 ABI, which was removed in commit de479a54e22e8fcb6262639a8e67fe8b00a27c37 but is needed to ensure source code compatibility with GCC 14. The change in newlib/libc/machine/aarch64/asmdefs.h puts it out of sync with the current upstream implementation in the https://github.com/ARM-software/optimized-routines repository. Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
2025-07-25 | Revert Joel's working ilp32 patch | Joel Sherrill | 16 | -108/+54
This was what I was using locally before Radek Bartoň <radek.barton@microsoft.com> had his version of the patch. Revert in favor of his final version. Revert 70c5505766ad4ae62e4d045835ed2a6b928d5760
2025-07-25 | ilp32: Revert patch removing ilp32 support | Joel Sherrill | 16 | -54/+108
ilp32 support was removed prematurely. It is still in GCC 15, which is the latest GCC release.
From: <radek.barton@microsoft.com>
Date: Thu, 5 Jun 2025 11:32:08 +0200
Subject: [PATCH] newlib: libc: update asmdefs.h compatible with Cygwin AArch64
This patch synchronizes the newlib/libc/machine/aarch64/asmdefs.h header with the version from commit https://github.com/ARM-software/optimized-routines/commit/4352245388a55a836f3ac9ac5907022c24ab8e4c, which added support for AArch64 Cygwin. That version of the header removed the PTR_ARG and SIZE_ARG macros because ILP32 was deprecated. This affects many .S files, so the patch also removes the usages of those macros. On top of that, setjmp.S and rawmemchr.S were refactored to use the ENTRY/ENTRY_ALIGN and END macros. Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
2025-07-21 | newlib: add dummy implementations of fe{get,set}prec for Aarch64 Cygwin | Radek Bartoň | 1 | -0/+15
This patch introduces dummy implementations of fegetprec and fesetprec for the Cygwin build, as those symbols are exported by cygwin1.dll and AArch64 does not support changing floating point precision at runtime. Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
2025-07-17 | nvptx: Change 'read' and 'write' to 'ssize_t' return type | Arijit Kumar Das | 2 | -2/+3
This commit changes the return type of the read() and write() syscalls for nvptx to ssize_t. This allows large files to be handled properly by these syscalls in situations where the read/write buffer length exceeds INT_MAX, for example. It also makes the syscall signatures fully compliant with their current POSIX specifications. We additionally define two macros: '_READ_WRITE_RETURN_TYPE' as _ssize_t and '_READ_WRITE_BUFSIZE_TYPE' as __size_t in libc/include/sys/config.h under __nvptx__ for consistency. Signed-off-by: Arijit Kumar Das <arijitkdgit.official@gmail.com>
2025-07-17 | newlib: libc: update asmdefs.h compatible with Cygwin AArch64 | Radek Bartoň | 16 | -108/+54
This patch synchronizes the newlib/libc/machine/aarch64/asmdefs.h header with the version from commit https://github.com/ARM-software/optimized-routines/commit/4352245388a55a836f3ac9ac5907022c24ab8e4c, which added support for AArch64 Cygwin. That version of the header removed the PTR_ARG and SIZE_ARG macros because ILP32 was deprecated. This affects many .S files, so the patch also removes the usages of those macros. On top of that, setjmp.S and rawmemchr.S were refactored to use the ENTRY/ENTRY_ALIGN and END macros. Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
2025-07-10 | RISC-V: memmove() speed optimized: Call memcpy() | m fally | 1 | -67/+93
Redirect to memcpy() if the memory areas of source and destination do not overlap. Only redirect if length is > SZREG in order to reduce overhead on very short copies. Signed-off-by: m fally <marlene.fally@gmail.com>
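The redirect described above can be sketched in portable C. This is an illustration under stated assumptions, not the port's implementation: `memmove_sketch`, `regions_disjoint`, and `SZREG_SKETCH` (a stand-in for the port's SZREG, taken here as the size of `uintptr_t`) are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for the port's register width in bytes. */
#define SZREG_SKETCH sizeof(uintptr_t)

/* Nonzero when [dst, dst+n) and [src, src+n) share no byte. */
static int regions_disjoint(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    return (d >= s) ? (d - s >= n) : (s - d >= n);
}

void *memmove_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Long, non-overlapping copies can take the faster memcpy() path;
       very short copies skip the check to avoid its overhead. */
    if (n > SZREG_SKETCH && regions_disjoint(dst, src, n))
        return memcpy(dst, src, n);

    if (d < s) {
        while (n--)
            *d++ = *s++;        /* forward copy */
    } else if (d > s) {
        d += n;
        s += n;
        while (n--)
            *--d = *--s;        /* backward copy for destructive overlap */
    }
    return dst;
}
```

Calling `memmove_sketch(buf + 2, buf, 10)` on an overlapping buffer takes the backward byte loop, while two separate buffers longer than SZREG_SKETCH go through memcpy().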
2025-07-10 | RISC-V: memmove() speed optimized: Align source address | m fally | 1 | -50/+135
If misaligned accesses are slow or prohibited, either the source or the destination address is unaligned, and the number of bytes to be copied is > SZREG*2, align the source address to xlen. This speeds up the function in the case where at least one address is unaligned, since now one word (or doubleword for rv64) is loaded at a time, reducing the number of memory accesses necessary. We still need to store back individual bytes since the destination address might (still) be unaligned after aligning the source. The threshold of SZREG*2 was chosen to keep the negative effect on shorter copies caused by the additional overhead from aligning the source low. This change also affects the case where both addresses are xlen-aligned, the memory areas overlap destructively, and length is not a multiple of SZREG. In the destructive-overlap case, the copying needs to be done in reverse order. Therefore the length is added to the addresses first, which causes them to become unaligned. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-07-10 | RISC-V: memmove() speed optimized: Add loop-unrolling | m fally | 1 | -9/+48
Add loop-unrolling for the case where both source and destination address are aligned in the case of a destructive overlap, and increase the unroll factor from 4 to 9 for the word-by-word copy loop in the non-destructive case. This matches the loop-unrolling done in memcpy() and increases performance for lengths >= SZREG*9 while barely degrading performance for shorter lengths. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-07-10 | RISC-V: memmove() speed optimized: Replace macros and use fixed-width types | m fally | 1 | -25/+35
Replace macros with static inline functions or RISC-V specific macros in order to keep consistency between all functions in the port. Change data types to fixed-width and/or RISC-V specific types. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-07-10 | RISC-V: memmove() speed optimized: Add implementation | m fally | 4 | -15/+100
Copy the common implementation of memmove() to the RISC-V port. Rename memmove.S to memmove-asm.S to keep naming of files consistent between functions. Update Makefile.inc with the changed filenames. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-07-04 | libc: mips: fix strcmp bug for little endian targets | Faraz Shahbazker | 1 | -2/+5
strcmp gives an incorrect result for little endian targets under the following conditions:
1. Length of the first string is 1 less than a multiple of 4 (i.e. len % 4 == 3)
2. The first string is a prefix of the second string
3. The first differing character in the second string is extended ASCII (that is, > 127)
Signed-off-by: Jovan Dmitrović <jovan.dmitrovic@htecgroup.com>
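The three conditions above can be turned into a small regression check. The following is a sketch, not the test used by the authors: `strcmp_reference` and `check_prefix_extended_ascii` are hypothetical names, and the byte 0xE9 is an arbitrary choice of a value above 127.

```c
#include <string.h>

/* Reference comparison: the C standard requires strcmp() to compare
   characters as unsigned char. */
int strcmp_reference(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p != 0 && *p == *q) {
        p++;
        q++;
    }
    return (*p > *q) - (*p < *q);
}

/* Returns 1 if cmp() reports the correct sign for an input that meets
   all three conditions, 0 otherwise. */
int check_prefix_extended_ascii(int (*cmp)(const char *, const char *))
{
    const char first[] = "abc";              /* length 3, so len % 4 == 3 */
    const char second[] = "abc\xE9" "xyz";   /* "abc", then byte 0xE9 (> 127) */

    /* The shorter prefix must sort before the longer string. */
    return cmp(first, second) < 0 && cmp(second, first) > 0;
}
```

Passing the libc `strcmp` under test as `cmp` exercises the reported case; a result of 0 would indicate the sign error described in the commit.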
2025-07-04 | libc: mips: Improve performance of strcmp implementation | Faraz Shahbazker | 1 | -79/+123
Improve `strcmp` by using `ext` instruction, if available. Signed-off-by: Jovan Dmitrović <jovan.dmitrovic@htecgroup.com>
2025-07-04 | libc: mips: memcpy prefetches beyond copied memory | Faraz Shahbazker | 1 | -53/+97
Fix prefetching in core loop to avoid exceeding the operated upon memory region. Revert accidentally changed prefetch-hint back to streaming mode. Refactor various bits and provide pre-processor checks to allow parameters to be overridden from compiler command line. Signed-off-by: Jovan Dmitrović <jovan.dmitrovic@htecgroup.com>
2025-07-04 | libc: mips: Add improved C implementation of memcpy/memset | Faraz Shahbazker | 3 | -1/+582
Signed-off-by: Jovan Dmitrović <jovan.dmitrovic@htecgroup.com>
2025-07-04 | mips: Implement MIPS HAL and UHI | Jovan Dmitrović | 4 | -119/+135
Implement abstract interface for MIPS, including unified hosting interface (UHI). Signed-off-by: Jovan Dmitrović <jovan.dmitrovic@htecgroup.com>
2025-07-04 | aarch64: Export fe{enable,disable,get}except on Cygwin | Radek Bartoň | 1 | -4/+12
For aarch64 on ELF targets, the library does not export fe{enable,disable,get}except as symbols from the library, relying on static inline functions to provide suitable definitions if required. But for Cygwin we need to create real definitions to satisfy the DLL export script. So arrange for real definitions of these functions when building on Cygwin. Signed-off-by: Radek Bartoň <radek.barton@microsoft.com>
2025-07-02 | RISC-V: Fix memcpy() for GCC 13 | Sebastian Huber | 1 | -2/+2
GCC 13 does not define the __riscv_misaligned_* builtin defines. They are supported by GCC 14 or later. Test for __riscv_misaligned_fast to select an always correct memcpy() implementation for GCC 13. Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
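A minimal sketch of the kind of preprocessor test described above. `__riscv_misaligned_fast` is the compiler-provided macro named in the commit; `SKETCH_WORD_COPY_ANY_ALIGNMENT` and the function are hypothetical names, and the actual condition in the port may be structured differently.

```c
#include <stddef.h>

/* Only a compiler that positively reports fast misaligned access selects
   the path that copies words regardless of alignment. GCC 13 defines none
   of the __riscv_misaligned_* macros, so it falls through to the path that
   is correct on every target. */
#if defined(__riscv_misaligned_fast)
#define SKETCH_WORD_COPY_ANY_ALIGNMENT 1
#else
#define SKETCH_WORD_COPY_ANY_ALIGNMENT 0
#endif

/* Reports which path this translation unit was built with. */
int sketch_word_copy_any_alignment(void)
{
    return SKETCH_WORD_COPY_ANY_ALIGNMENT;
}
```

Testing for the presence of the "fast" macro, rather than the absence of the "slow" or "avoid" macros, is what makes the fallback safe when none of them is defined.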
2025-06-02 | RISC-V: strcmp [speed optimized]: optimize mismatch logic for targets with Zb* extension support | puranikvinit | 1 | -63/+94
Reworks the mismatch handling to use Zbb's ctz/clz instructions for faster byte difference detection, significantly improving performance on Zbb-capable targets. Non-Zbb targets retain the original logic for compatibility. Signed-off-by: puranikvinit <kvp933.vinit@gmail.com> Reviewed-by: Christian Herber <christian.herber@oss.nxp.com>
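The idea of locating the first differing byte with a count-trailing-zeros operation can be sketched in C. This assumes a little-endian 32-bit word, a caller that has already established that the first word contains no null byte, and the GCC/Clang builtin `__builtin_ctz` (which a compiler may lower to the Zbb ctz instruction). The function names are hypothetical, and the assembly in the port may differ.

```c
#include <stdint.h>

/* For two little-endian words known to differ, return the index (0..3)
   of the first differing byte in memory order. On a big-endian target
   the same idea would use count-leading-zeros instead. */
unsigned first_diff_byte_le(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;      /* nonzero by precondition */
    return (unsigned)__builtin_ctz(diff) / 8u;
}

/* strcmp-style result for one word pair; assumes a holds no null byte. */
int compare_words_le(uint32_t a, uint32_t b)
{
    unsigned shift;

    if (a == b)
        return 0;
    shift = first_diff_byte_le(a, b) * 8u;
    return (int)((a >> shift) & 0xFFu) - (int)((b >> shift) & 0xFFu);
}
```

For the words holding "abcd" and "abxd", the XOR is nonzero only in byte 2, so the comparison reduces to 'c' minus 'x' without a byte-by-byte loop.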
2025-06-02 | RISC-V: strcmp [speed optimized]: optimize null detect logic for targets with Zb* extension support | puranikvinit | 1 | -4/+8
Introduces conditional use of the orc.b instruction from the Zbb extension for null byte detection, falling back to the original logic for non-Zbb targets. This reduces cycles in the hot path for supported architectures. Signed-off-by: puranikvinit <kvp933.vinit@gmail.com> Reviewed-by: Christian Herber <christian.herber@oss.nxp.com>
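To illustrate the two detection strategies, here is a sketch in C for 32-bit words. `orc_b_emulated` is a portable emulation of what orc.b computes, and `has_null_generic` is a widely used bit trick that is not necessarily the exact fallback sequence in the port; all names are hypothetical.

```c
#include <stdint.h>

/* Emulates orc.b: every nonzero byte becomes 0xFF, every zero byte
   stays 0x00. */
uint32_t orc_b_emulated(uint32_t x)
{
    uint32_t out = 0;
    int i;

    for (i = 0; i < 4; i++) {
        if ((x >> (8 * i)) & 0xFFu)
            out |= (uint32_t)0xFFu << (8 * i);
    }
    return out;
}

/* With orc.b, a word contains a null byte exactly when the result is
   not all ones, so one instruction plus one compare suffices. */
int has_null_orc_b(uint32_t x)
{
    return orc_b_emulated(x) != UINT32_MAX;
}

/* Mask-based test for targets without Zbb. */
int has_null_generic(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}
```

Both predicates agree on every input; the orc.b form simply needs fewer instructions where the extension is available.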
2025-06-02 | RISC-V: strcmp [speed optimized]: use compressed registers wherever possible | puranikvinit | 1 | -4/+4
Replaces temporary registers (t0) with compressed registers (a4) in the null detection loop, reducing instruction count and code size in speed-optimized builds while maintaining identical logic. Signed-off-by: puranikvinit <kvp933.vinit@gmail.com> Reviewed-by: Christian Herber <christian.herber@oss.nxp.com>
2025-06-02 | RISC-V: strcmp: refactor labels for improved readability | puranikvinit | 1 | -13/+18
Renames labels in strcmp.S to use descriptive .L prefixes (e.g., .Lcompare, .Lreturn_diff) instead of numeric labels (e.g., 1f, 2f). This improves maintainability and aligns with modern assembly conventions without affecting functionality. Signed-off-by: puranikvinit <kvp933.vinit@gmail.com> Reviewed-by: Christian Herber <christian.herber@oss.nxp.com>
2025-06-02 | RISC-V: setjmp: reduce code size for register load/store with Zilsd | puranikvinit | 1 | -25/+49
This patch optimizes the RISC-V setjmp implementation in newlib/libc/machine/riscv/setjmp.S for 32-bit targets. It reduces code size by using doubleword store/load instructions (sd/ld) when the Zilsd or Zclsd extensions are available for saving and restoring s0-s11 registers, while preserving the original single-word instructions (REG_S/REG_L) for compatibility with other configurations. Signed-off-by: puranikvinit <kvp933.vinit@gmail.com> Reviewed-by: Christian Herber <christian.herber@oss.nxp.com>
2025-06-02 | newlib: riscv: Remove undefined behavior in strlen() | Eric Salem | 1 | -1/+4
Pointer arithmetic overflow is undefined behavior, so use a signed type to avoid it. Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-05-28 | newlib: riscv: Align whitespace of size optimized memset() | Eric Salem | 1 | -4/+4
Align the whitespace of the size optimized implementation of memset() to match the speed optimized version. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-05-28 | newlib: riscv: Optimize memset() for speed | Eric Salem | 1 | -71/+262
The RISC-V Zba, Zbkb, and Zilsd/Zclsd extensions provide instructions optimized for bit and load/store operations. Use them when available for the RISC-V port. Also increase loop unrolling for faster performance. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-05-27 | RISC-V: memcpy() align dest when misaligned access is prohibited | Mahmoud Abumandour | 1 | -12/+60
Add a code path for when source and dest are differently aligned. If misaligned access is slow or prohibited, and the alignments of the source and destination are different, we align the destination to do XLEN stores. This uses only one aligned store for every four (or eight for XLEN == 64) bytes of data. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Mahmoud Abumandour <ma.mandourr@gmail.com>
2025-05-27 | RISC-V: memcpy() Use inline functions instead of macros and gotos | Mahmoud Abumandour | 1 | -18/+24
Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Mahmoud Abumandour <ma.mandourr@gmail.com>
2025-05-27 | RISC-V: memcpy() Use uintxlen_t for xlen-sized copy | Mahmoud Abumandour | 1 | -35/+36
Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Mahmoud Abumandour <ma.mandourr@gmail.com>
2025-05-21 | newlib: riscv: Optimize memchr() and memrchr() | Eric Salem | 4 | -83/+249
The RISC-V Zbb, Zbkb, and Zilsd extensions provide instructions optimized for bit and load/store operations. Use them when available for the RISC-V port. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-05-21 | newlib: riscv: Add memchr() and memrchr() implementations | Eric Salem | 3 | -1/+199
Copy stock implementations of memchr() and memrchr() to the RISC-V port. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-05-02 | newlib: riscv: Remove unnecessary byte load for strlen() | Eric Salem | 1 | -7/+7
For architectures where XLEN is 32 bits, when detecting a null byte, a word is read at a time. Once a null is found in the word, its precise location is then determined. Make clear to the compiler that if the first three bytes are not null, the last byte must be null, and does not need to be read from the string, since its value is always zero. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
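The reasoning above, sketched for a 32-bit little-endian word that is already known to contain a null byte. `null_index_in_word` is a hypothetical helper; the port works on the string bytes directly, so this only illustrates why the fourth check is redundant.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index (0..3) of the first null byte in a little-endian
   word. Precondition: the word contains at least one null byte. */
size_t null_index_in_word(uint32_t word)
{
    if ((word & 0x000000FFu) == 0)
        return 0;
    if ((word & 0x0000FF00u) == 0)
        return 1;
    if ((word & 0x00FF0000u) == 0)
        return 2;
    /* Bytes 0..2 are nonzero and a null exists, so it must be byte 3.
       No fourth load or comparison is needed. */
    return 3;
}
```

The same argument applies to the copy routines: the last byte written can be a constant zero rather than a value read back from the source.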
2025-04-23 | newlib: riscv: Fix build for rv64e | Eric Salem | 1 | -2/+2
Update the macro check so that rv64e builds successfully. Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-04-15 | newlib: riscv: Remove unnecessary byte load/store for stpcpy()/strcpy() | Eric Salem | 1 | -3/+3
For architectures where XLEN is 32 bits, when detecting a null byte, a word is read at a time. Once a null is found in the word, its precise location is then determined. Make clear to the compiler that if the first three bytes are not null, the last byte must be null, and does not need to be read from the source string, since its value is always zero. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-04-11 | RISC-V: Size optimized versions: Replace add with addi | m fally | 4 | -8/+8
Replace add instructions with addi where applicable in the size optimized versions of memmove(), memset(), memcpy(), and strcmp(). This change does not affect the functions themselves and is only done to improve syntactic accuracy. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memset() size optimized version: Rename local labels | m fally | 1 | -4/+4
Rename local labels to improve readability. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memset() size optimized version: Use compressed registers | Eric Salem | 1 | -3/+3
Swap register t1 with a3, so that the affected instructions can be compressed. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Reviewed-by: m fally <marlene.fally@gmail.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-04-11 | RISC-V: memcpy() size optimized version: Use compressed registers | Mahmoud Abumandour | 1 | -4/+4
Replace registers t1 and t2 with registers a3 and a4 respectively, so that the affected instructions can be compressed. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Mahmoud Abumandour <ma.mandourr@gmail.com>
2025-04-11 | RISC-V: memcpy() size optimized version: Replace lb with lbu | Mahmoud Abumandour | 1 | -1/+1
Replace lb with lbu to avoid unnecessary sign extension. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Mahmoud Abumandour <ma.mandourr@gmail.com>
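As a rough C analogy for the two load instructions (not code from the port): lb behaves like reading a `signed char` and widening it, lbu like reading an `unsigned char`. The function names are hypothetical.

```c
/* Analogue of lb: the byte is sign-extended when widened. */
int load_byte_signed(const void *p)
{
    return *(const signed char *)p;
}

/* Analogue of lbu: the byte is zero-extended when widened. */
int load_byte_unsigned(const void *p)
{
    return *(const unsigned char *)p;
}
```

For a copy loop only the low 8 bits are stored back, so either load yields the same memory contents; the unsigned form just avoids an extension step that the result never uses.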
2025-04-11 | RISC-V: memmove() size optimized version: Relax RAW dependency | m fally | 1 | -2/+2
Move the instruction that increments the remaining number of bytes to be copied in between the load and store instructions. This is done in order to relax the RAW dependency between the load and store instructions. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memmove() size optimized version: Replace lb with lbu | m fally | 1 | -3/+3
Replace lb with lbu to avoid unnecessary sign extension. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memmove() size optimized version: Add comments | m fally | 1 | -8/+8
Since the algorithm in this version of memmove() is different from the original version, add comments to give a description. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Reviewed-by: Eric Salem <ericsalem@gmail.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memmove() size optimized version: Rename local labels | m fally | 1 | -6/+6
Rename local labels so that the structure of the function is clearer. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memmove() size optimized version: Use compressed registers only | m fally | 1 | -8/+8
Change register t1 to register a4, so that the affected instructions can be compressed. Since we now have fewer registers available, the following changes need to be made: In the previous version of this function, a4 was used to hold the offset that needs to be added to source and destination addresses before copying any data in the case of source address > destination address. Since a4 now holds the destination address, this offset is not calculated anymore. Instead, the value in a2 (the number of bytes to be copied) is added to the source and destination addresses. Therefore, in the case of source address > destination address, a value of 1 needs to be subtracted from both addresses before starting the copying process. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-11 | RISC-V: memmove() size optimized version: Use compressed register | m fally | 1 | -2/+2
Replace register t2 with register a5, so that lb/sb instructions can be compressed. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: m fally <marlene.fally@gmail.com>
2025-04-02 | newlib: riscv: Fix build and reorganize header files | Eric Salem | 6 | -11/+21
The sys/asm.h header file is included for certain assembly files, so move the typedef to a separate header file due to the build breaking on some systems. Also include the port's string header file (and move and rename) instead of the system's version. Addresses: https://sourceware.org/pipermail/newlib/2025/021591.html Fixes: c3b9bb173c8c ("newlib: riscv: Add XLEN typedef and clean up types") Reported-by: Jeff Law <jlaw@ventanamicro.com> Suggested-by: Kito Cheng <kito.cheng@gmail.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-04-01 | RISC-V: Fix the asm code for large code model | Kito Cheng | 1 | -1/+8
The large code model assumes the data may be far away from the code, so we must put the address of the target data within the `.text` section. Normally we put it within or near the function to keep it in range. Reported in riscv-gnu-toolchain: https://github.com/riscv-collab/riscv-gnu-toolchain/issues/1699 Verified with riscv-gnu-toolchain with rv64gc.
2025-03-24 | newlib: riscv: Add stpcpy() port | Eric Salem | 4 | -45/+75
Add implementation of stpcpy() to the RISC-V port. Also refactor shared code between strcpy() and stpcpy() to a common function. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-03-24 | newlib: riscv: Optimize strlen() | Eric Salem | 2 | -12/+57
The RISC-V Zbb extension provides instructions optimized for bit operations. Use them when available for the RISC-V port. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
2025-03-24 | newlib: riscv: Add XLEN typedef and clean up types | Eric Salem | 4 | -44/+45
The size of the long data type isn't precisely defined in the C standard, so create a new typedef that uses either uint32_t or uint64_t based on XLEN. The fixed width types are more robust against any ABI changes and fit the data types of the intrinsic functions. Use the new uintxlen_t type instead of long and uintptr_t. Reviewed-by: Christian Herber <christian.herber@oss.nxp.com> Signed-off-by: Eric Salem <ericsalem@gmail.com>
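A typedef of this kind might look like the sketch below. `__riscv_xlen` is the compiler-provided macro on RISC-V targets; the final fallback branch and the helper `xlen_bytes` are assumptions added so the sketch also compiles on other hosts, and the header in the port may be laid out differently.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_xlen) && __riscv_xlen == 64
typedef uint64_t uintxlen_t;
#elif defined(__riscv_xlen) && __riscv_xlen == 32
typedef uint32_t uintxlen_t;
#else
/* Assumption: non-RISC-V host, fall back to the pointer-sized type. */
typedef uintptr_t uintxlen_t;
#endif

/* Hypothetical helper: number of bytes moved by one xlen-sized access. */
size_t xlen_bytes(void)
{
    return sizeof(uintxlen_t);
}
```

Because the width is fixed by XLEN rather than by the ABI's choice for `long`, word-at-a-time loops written against `uintxlen_t` keep the same stride across ABIs.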