Age | Commit message (Collapse) | Author | Files | Lines |
|
- cpuid bit for prefetchi is different from Intel
(https://docs.amd.com/v/u/en-US/24594_3.37)
- Fix cpu family model numbers
|
|
The 256-bit maximum vector register size control was removed from the
AVX10 whitepaper, ref: https://cdrdv2.intel.com/v1/dl/getContent/784343
These options have produced warnings since LLVM 21 (#132542). This patch
removes the underlying implementations in LLVM 22.
|
|
Use hex encoding for the CPUID family so that the number format matches
Intel ISE rev. 58:
https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
The new `sys::detail::getHostCPUNameForARM` for Windows (#151596) was
implemented using a C++ bit-field, which caused the associated unit
tests to fail on big-endian machines as it assumed a little-endian
layout.
This change switches from the C++ bit-field to LLVM's `BitField` type
instead.
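
The layout problem can be illustrated without LLVM at all. The sketch below is not LLVM's `BitField` API; it only shows the underlying idea of extracting fields with shifts and masks, which operate on the integer value and therefore do not depend on how a compiler lays out C++ bit-fields. The names `MidrFields`, `extractBits`, and `decodeMidr` are hypothetical; the bit positions follow the architectural definition of `MIDR_EL1`.

```cpp
#include <cstdint>

// Field layout of the AArch64 MIDR_EL1 register:
//   [31:24] Implementer, [23:20] Variant, [19:16] Architecture,
//   [15:4]  PartNum,     [3:0]   Revision
struct MidrFields {
  uint32_t Implementer;
  uint32_t Variant;
  uint32_t Architecture;
  uint32_t PartNum;
  uint32_t Revision;
};

// Extract Width bits starting at bit Shift. This works on the integer's
// value, so the result is the same on little- and big-endian hosts.
constexpr uint32_t extractBits(uint64_t Value, unsigned Shift,
                               unsigned Width) {
  return static_cast<uint32_t>((Value >> Shift) &
                               ((uint64_t{1} << Width) - 1));
}

// Hypothetical helper: split a raw MIDR value into its fields.
constexpr MidrFields decodeMidr(uint64_t Midr) {
  return MidrFields{extractBits(Midr, 24, 8), extractBits(Midr, 20, 4),
                    extractBits(Midr, 16, 4), extractBits(Midr, 4, 12),
                    extractBits(Midr, 0, 4)};
}
```

For example, `decodeMidr(0x410FD083)` yields Implementer `0x41` and PartNum `0xD08` regardless of host byte order.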
|
|
Uses the `CP 4000` registry keys under
`HKLM\HARDWARE\DESCRIPTION\System\CentralProcessor\*` to get the
Implementer and Part, which are then provided to a modified form of
`getHostCPUNameForARM` to map to a CPU.
On my local Surface Pro 11, `llc --version` reports:
```
> .\build\bin\llc.exe --version
LLVM (http://llvm.org/):
LLVM version 22.0.0git
Optimized build with assertions.
Default target: aarch64-pc-windows-msvc
Host CPU: oryon-1
```
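
The mapping step described above (Implementer and Part to a CPU name) can be sketched as a plain table lookup. This is a minimal illustration, not the code in `getHostCPUNameForARM`; `CpuEntry`, `KnownCpus`, and `lookupCpuName` are hypothetical names, and the table holds only a few well-known Arm Ltd. part numbers.

```cpp
#include <cstdint>
#include <string>

// One row of a hypothetical (implementer, part) -> name table.
struct CpuEntry {
  uint32_t Implementer;
  uint32_t Part;
  const char *Name;
};

// A few Arm Ltd. (implementer 0x41) parts, for illustration only.
static const CpuEntry KnownCpus[] = {
    {0x41, 0xd03, "cortex-a53"},
    {0x41, 0xd08, "cortex-a72"},
    {0x41, 0xd0b, "cortex-a76"},
};

// Return the CPU name for the pair, or "generic" when it is unknown.
std::string lookupCpuName(uint32_t Implementer, uint32_t Part) {
  for (const CpuEntry &E : KnownCpus)
    if (E.Implementer == Implementer && E.Part == Part)
      return E.Name;
  return "generic";
}
```

Falling back to `"generic"` for unknown pairs is an assumption of this sketch; it keeps the caller from having to handle an empty name.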
|
|
These changes allow LLVM and Clang to be built with Clang targeting
Arm64EC using the MSVC linker.
Built with these options:
```
-DLLVM_ENABLE_PROJECTS="clang"
-DLLVM_HOST_TRIPLE=arm64ec-pc-windows-msvc
-DCMAKE_C_COMPILER=clang-cl.exe
-DCMAKE_C_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_CXX_COMPILER=clang-cl.exe
-DCMAKE_CXX_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_LINKER_TYPE=MSVC
```
|
|
getHostCPUFeatures constructs and returns a temporary instance of
StringMap<bool>. We don't need const on the return type.
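
One reason a `const` by-value return type is usually unwanted: the returned temporary cannot bind to a move-assignment operator, so assigning from it falls back to a copy. The sketch below uses a hypothetical `Tracker` type that counts assignments instead of `StringMap<bool>`, so it stays self-contained.

```cpp
// Hypothetical type that records which assignment operator was used.
struct Tracker {
  static int CopyAssigns;
  static int MoveAssigns;
  Tracker() = default;
  Tracker(const Tracker &) = default;
  Tracker(Tracker &&) = default;
  Tracker &operator=(const Tracker &) { ++CopyAssigns; return *this; }
  Tracker &operator=(Tracker &&) { ++MoveAssigns; return *this; }
};
int Tracker::CopyAssigns = 0;
int Tracker::MoveAssigns = 0;

// Same body, different return types.
const Tracker makeConst() { return Tracker(); }
Tracker makePlain() { return Tracker(); }

// A const temporary cannot bind to Tracker&&, so this assignment copies.
bool constReturnForcesCopy() {
  Tracker T;
  const int Before = Tracker::CopyAssigns;
  T = makeConst();
  return Tracker::CopyAssigns == Before + 1;
}

// A non-const temporary binds to Tracker&&, so this assignment moves.
bool plainReturnMoves() {
  Tracker T;
  const int Before = Tracker::MoveAssigns;
  T = makePlain();
  return Tracker::MoveAssigns == Before + 1;
}
```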
|
|
Add parsing of some crypto features so that they are displayed properly
when -mcpu=native is used.
|
|
This patch adds support for -mcpu=gb10 (NVIDIA GB10). This is a
big.LITTLE cluster of Cortex-X925 and Cortex-A725 cores. The appropriate
MIDR numbers are added to detect them in -mcpu=native.
We did not add an -mcpu=cortex-x925.cortex-a725 option because GB10 does
include the crypto instructions which we want on by default, and the
current convention is to not enable such extensions for Arm Cortex cores
in -mcpu where they are optional in the IP.
Relevant GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687005.html
|
|
While we are at it, this patch switches to a range-based for loop.
|
|
We can get the `mvendorid/marchid/mimpid` via hwprobe and then we
can compare these IDs with those defined in processors to find the
CPU name.
With this change, `-mcpu/-mtune=native` can set the proper name.
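
The comparison step can be sketched as a lookup over a table of known processors. This is a minimal illustration only: the struct and function names are hypothetical, the ID values are made up, and the hwprobe call that would fill in the host IDs is left out.

```cpp
#include <cstdint>
#include <string>

// The three machine IDs that identify a RISC-V implementation.
struct CpuModel {
  uint32_t MVendorID;
  uint64_t MArchID;
  uint64_t MImpID;
};

struct KnownProcessor {
  const char *Name;
  CpuModel Model;
};

// Made-up entries; real values would come from the processor definitions.
static const KnownProcessor Processors[] = {
    {"example-core-a", {0x123, 0x8000000000000001ULL, 0x1}},
    {"example-core-b", {0x123, 0x8000000000000002ULL, 0x7}},
};

// Return the name whose three IDs all match the host, else "generic".
std::string cpuNameFromIds(const CpuModel &Host) {
  for (const KnownProcessor &P : Processors)
    if (P.Model.MVendorID == Host.MVendorID &&
        P.Model.MArchID == Host.MArchID && P.Model.MImpID == Host.MImpID)
      return P.Name;
  return "generic";
}
```

Requiring all three IDs to match is an assumption of this sketch; a looser policy (for example, ignoring `mimpid`) would be a different design choice.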
|
|
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.
For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320
---------
Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
|
|
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.
This patch also adds the scheduler description for the z17 processor,
provided by Jonas Paulsson.
|
|
This patch adds support for the NVIDIA Olympus core.
This does not add any special tuning decisions, and those may come
later.
|
|
This enables -mcpu=native for the HiFive Premier P550 board.
|
|
Two options for clang
-mno-scq: Disable sc.q instruction.
-mscq: Enable sc.q instruction.
The default is -mno-scq.
|
|
This patch adds support for the next-generation arch15
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch15 as host processor.
- Assembler/disassembler support for new instructions.
- Exploitation of new instructions for code generation.
- New vector (signed|unsigned|bool) __int128 data types.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10305.
Note: No currently available Z system supports the arch15
architecture. Once new systems become available, the
official system name will be added as a supported -march name.
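
User code could test for the new vector API level through the `__VEC__` value mentioned in the list above. The sketch below is only an illustration: `vecLevel` and `hasArch15VectorApi` are hypothetical helpers, and the extra `__s390x__` guard is an assumption added here so the check is not confused by other targets that also define `__VEC__`.

```cpp
// Report the vector API level, or 0 when not compiling for SystemZ
// with vector support enabled.
constexpr long vecLevel() {
#if defined(__s390x__) && defined(__VEC__)
  return __VEC__;
#else
  return 0;
#endif
}

// True when the compiler advertises the arch15 level (10305) or newer.
constexpr bool hasArch15VectorApi() { return vecLevel() >= 10305; }
```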
|
|
Add Apple M4 host detection, which fixes
https://github.com/rust-lang/rust/issues/133414.
Also add support for older ARM families (this is likely never going to
get used, since only macOS is officially supported as host OS, but it is
nice to have for completeness' sake). Error handling (checking
`CPUFAMILY_UNKNOWN`) is also included here.
Finally, add links to extra documentation to make it easier for others
to update this in the future.
NOTE: These values are taken from `mach/machine.h` in the Xcode 16.2 SDK,
and have been confirmed on an M4 Max in
https://github.com/rust-lang/rust/issues/133414#issuecomment-2499123337.
|
|
StringRef separator. NFC
|
|
Part numbers taken from:
https://github.com/AsahiLinux/m1n1/blob/main/src/chickens.c
Reviewers: ahmedbougacha, jroelofs
Reviewed By: jroelofs
Pull Request: https://github.com/llvm/llvm-project/pull/119777
|
|
This patch adds initial support for FUJITSU-MONAKA CPU (-mcpu=fujitsu-monaka).
The scheduling model will be corrected in the future.
|
|
|
|
Fixes: #118205
|
|
Two options for clang: -mlamcas & -mno-lamcas.
Enable or disable amcas[_db].{b/h} instructions.
The default is -mno-lamcas.
Only works on LoongArch64.
|
|
with inputs not sign-extended. (#116764)
Two options for clang
-mdiv32: Use div.w[u] and mod.w[u] instructions with input not
sign-extended.
-mno-div32: Do not use div.w[u] and mod.w[u] instructions with input not
sign-extended.
The default is -mno-div32.
|
|
|
|
0x700. (#116762)
Two options for clang
-mld-seq-sa: Do not generate load-load barrier instructions (dbar 0x700)
-mno-ld-seq-sa: Generate load-load barrier instructions (dbar 0x700)
The default is -mno-ld-seq-sa
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
This reverts commit 826b845c9e97448395431be3e4e5da585bd98c5e.
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
Two features (`frecipe` and `lam-bh`) are added to
`sys::getHostCPUFeatures`. More features will be added in the future.
In addition, this patch adds the features returned by
`sys::getHostCPUFeatures` when `-march=native` is used.
|
|
(#115467)
…eatures in cpu info
Relands #97749. Fixed test by adding additional checks for system linux
and target == host.
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
Resolve the compile failure without SSE2.
|
|
Reverts llvm/llvm-project#114070
Reason: Causes `immintrin.h` to fail to compile if `-msse` and
`-mno-sse2` are passed to clang:
https://github.com/llvm/llvm-project/pull/114070#issuecomment-2465926700
|
|
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
(#114066)
…features in cpu info (#97749)"
This reverts commit d732c0b13c55259177f2936516b6087d634078e0.
This is breaking buildbots
https://lab.llvm.org/buildbot/#/builders/190/builds/8413,
https://lab.llvm.org/buildbot/#/builders/56/builds/10880 and a few
others.
|
|
info (#97749)
Add getHostCPUFeatures into the AArch64 Target Parser to query the
cpuinfo for the device in the case where we are compiling with
-mcpu=native.
Add the LLVM_CPUINFO environment variable so that tests can supply mock
/proc/cpuinfo files for -mcpu=native.
Co-authored-by: Elvina Yakubova <eyakubova@nvidia.com>
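
A rough sketch of the idea follows: pick the cpuinfo file from the environment when `LLVM_CPUINFO` is set, then read the `Features` line. This is not the code from the patch; `cpuinfoPath`, `parseFeatures`, and `parseFeaturesFromText` are hypothetical helpers, and the fallback to `/proc/cpuinfo` when the variable is unset or empty is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Use the mock file named by LLVM_CPUINFO if set, else the real one.
std::string cpuinfoPath() {
  const char *Override = std::getenv("LLVM_CPUINFO");
  return (Override && *Override) ? Override : "/proc/cpuinfo";
}

// Return the whitespace-separated tokens of the first "Features" line,
// or an empty vector when there is no such line.
std::vector<std::string> parseFeatures(std::istream &In) {
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.compare(0, 8, "Features") != 0)
      continue;
    std::size_t Colon = Line.find(':');
    if (Colon == std::string::npos)
      continue;
    std::istringstream Tokens(Line.substr(Colon + 1));
    std::vector<std::string> Result;
    for (std::string Tok; Tokens >> Tok;)
      Result.push_back(Tok);
    return Result;
  }
  return {};
}

// Convenience wrapper for in-memory text, handy in tests.
std::vector<std::string> parseFeaturesFromText(const std::string &Text) {
  std::istringstream In(Text);
  return parseFeatures(In);
}
```

In real use, the stream would be a `std::ifstream` opened on `cpuinfoPath()`; keeping the parser on `std::istream` lets a test feed it mock text directly.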
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/671368
|
|
STAR-MC1 is an Armv8-M CPU.
Technical specifications available at:
https://www.armchina.com/download/Documents/Application-Notes/Technical-Reference-Manual?infoId=160
|
|
A very simple one-liner that adds the missing detection for the Llano
family, which is essentially a refreshed K10.
Documentation of the family id:
https://en.wikichip.org/wiki/amd/cpuid#Family_18_.2812h.29
Documentation that it fits into amdfam10:
https://en.wikipedia.org/wiki/AMD_10h#12h
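
For context, family 12h (18 decimal) is not stored directly in CPUID: it is the base family `0xF` plus an extended family of `0x3`. The sketch below shows the usual decoding of CPUID leaf 1 `EAX`; `FamilyModel` and `decodeFamilyModel` are hypothetical names and this is not a copy of the detection code the patch touches.

```cpp
#include <cstdint>

struct FamilyModel {
  unsigned Family;
  unsigned Model;
};

// Decode the display family and model from CPUID leaf 1 EAX:
//   [11:8] base family, [7:4] base model,
//   [27:20] extended family, [19:16] extended model.
FamilyModel decodeFamilyModel(uint32_t EAX) {
  unsigned Family = (EAX >> 8) & 0xf;
  unsigned Model = (EAX >> 4) & 0xf;
  if (Family == 6 || Family == 0xf) {
    if (Family == 0xf)
      Family += (EAX >> 20) & 0xff;     // add the extended family
    Model += ((EAX >> 16) & 0xf) << 4;  // extended model is the high nibble
  }
  return {Family, Model};
}
```

With an `EAX` of `0x00300F10` (base family `0xF`, extended family `0x3`), the decoded family is `0x12`, which is the value the new check has to recognize.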
|
|
This patch is a follow-up to
https://github.com/llvm/llvm-project/pull/94352 with some updates:
1. Add support for more extensions for `zve*`, `zimop`, `zc*`, `zcmop`
and `zawrs`.
2. Use `RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF` to check whether the
processor supports fast misaligned scalar memory access.
https://github.com/llvm/llvm-project/pull/108551 reminds me that the
patch
https://lore.kernel.org/all/20240809214444.3257596-1-evan@rivosinc.com/T/
has been merged. This also addresses the review comment at
https://github.com/llvm/llvm-project/pull/94352#discussion_r1626056015.
References:
1. constants:
https://github.com/torvalds/linux/blame/v6.11-rc7/arch/riscv/include/uapi/asm/hwprobe.h
2. https://docs.kernel.org/arch/riscv/hwprobe.html
3. Related commits:
1. `zve*` support:
https://github.com/torvalds/linux/commit/de8f8282a969d0b7342702f355886aab3b14043d
2. `zimop` support:
https://github.com/torvalds/linux/commit/36f8960de887a5e2811c5d1c0517cfa6f419c1c4
3. `zc*` support:
https://github.com/torvalds/linux/commit/0ad70db5eb21e50ed693fa274bea0346de453e29
4. `zcmop` support:
https://github.com/torvalds/linux/commit/fc078ea317cc856c1e82997da7e8fd4d6da7aa29
5. `zawrs` support:
https://github.com/torvalds/linux/commit/244c18fbf64a33d152645766a033b2935ab0acb5
6. scalar misaligned perf:
https://github.com/torvalds/linux/commit/c42e2f076769c9c1bc5f3f0aa1c2032558e76647
and
https://github.com/torvalds/linux/commit/1f5288874de776412041022607513ffac74ae1a6
|
|
In https://github.com/llvm/llvm-project/issues/90365 it was reported
that TargetParser arrives at the wrong conclusion regarding what
features are enabled when attempting to detect "native" features on the
Raspberry Pi 4, because it (correctly) detects it as a Cortex-A72, but
LLVM (incorrectly) believes all Cortex-A72s have crypto enabled. Attempt
to help ourselves by allowing runtime information derived from the host
to contradict whatever we believe is "true" about the architecture.
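
The override described above can be sketched as a map merge in which host-detected values win over the per-CPU defaults. This is only an illustration of the idea, not the TargetParser code; `FeatureMap` and `mergeHostFeatures` are hypothetical names, and plain `std::map` stands in for LLVM's own containers.

```cpp
#include <map>
#include <string>

using FeatureMap = std::map<std::string, bool>;

// Start from what the CPU name implies, then let anything detected on
// the host overwrite it, including turning a default-on feature off.
FeatureMap mergeHostFeatures(FeatureMap Defaults,
                             const FeatureMap &Detected) {
  for (const auto &KV : Detected)
    Defaults[KV.first] = KV.second;
  return Defaults;
}
```

For the Raspberry Pi 4 case, the defaults for a Cortex-A72 would include crypto as enabled, the detected set would report it as disabled, and the merged result would keep it disabled.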
|
|
The extension has been ratified for some time, but we kept it
experimental (see #99898) due to
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/444>. The
ABI issue has been resolved by #101023 so I believe there's no known
barrier to moving Zacas to non-experimental.
|
|
This patch adds the basic skeleton enablement for AMD's next-generation Zen 5 CPUs.
|
|
instructions (#101452)" (#101616)
Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
|