Age | Commit message (Collapse) | Author | Files | Lines |
|
This is a fairly large changeset, but it can be broken into a few
pieces:
- `llvm/Support/*TargetParser*` are all moved from the LLVM Support
component into a new LLVM Component called "TargetParser". This
potentially enables using tablegen to maintain this information, as
is shown in https://reviews.llvm.org/D137517. This cannot currently
be done, as llvm-tblgen relies on LLVM's Support component.
- This also moves two files from Support which use and depend on
information in the TargetParser:
- `llvm/Support/Host.{h,cpp}` which contains functions for inspecting
the current Host machine for info about it, primarily to support
getting the host triple, but also for `-mcpu=native` support in e.g.
Clang. This is fairly tightly intertwined with the information in
`X86TargetParser.h`, so keeping them in the same component makes
sense.
- `llvm/ADT/Triple.h` and `llvm/Support/Triple.cpp`, which contains
the target triple parser and representation. This is very intertwined
with the Arm target parser, because the arm architecture version
appears in canonical triples on arm platforms.
- I moved the relevant unittests to their own directory.
And so, we end up with a single component that has all the information
about the following, which to me seems like a unified component:
- Triples that LLVM Knows about
- Architecture names and CPUs that LLVM knows about
- CPU detection logic for LLVM
Given this, I have also moved `RISCVISAInfo.h` into this component, as
it seems to me to be part of that same set of functionality.
If you get link errors in your components after this patch, you likely
need to add TargetParser into LLVM_LINK_COMPONENTS in CMake.
Differential Revision: https://reviews.llvm.org/D137838
|
|
This change is rather more invasive than intended. The main intention
here is to make CommandLine.cpp not rely on llvm/Support/Host.h. Right
now, this reliance is only in 3 superficial places:
- Choosing how to expand response files (in two places)
- Printing the default triple and current CPU in `--version` output.
The built in version system has a method for adding "extra version
printers", commonly used by several tools (such as llc) to report the
registered targets in the built version of LLVM. It was reasonably easy
to move the logic for printing the default triple and current CPU into
a similar function, and register it with any relevant binaries.
The incompatible change here is that now, even if
LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO is defined, most binaries
will no longer print out the default target triple and cpu when provided
with `--version`, for instance llvm-as and llvm-dis. This breakage is
intended, but the changes in this patch keep printing the default target
and detected in `llc` and `opt` as these were remarked as important
binaries in the LLVM install.
The change to expanding response files may also be controversial, but I
believe that these macros should correspond exactly to the host triple
introspection used before.
Differential Revision: https://reviews.llvm.org/D137837
|
|
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D139073
|
|
d16 was removed in https://reviews.llvm.org/D60691.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D139304
|
|
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
The big change here is that now if you have threading disabled,
`llvm::get_physical_cores()` will return -1, as if it had not been able
to work out the right info. This is due to how Threading.cpp includes
OS-specific code/headers. This seems ok, as if threading is disabled,
LLVM should not need to know the number of physical cores.
Differential Revision: https://reviews.llvm.org/D137836
|
|
This reverts commit 5577207d6d3e0642ea047a8dfbfcf3ad372a7f25.
This breaks building LLVM on recent macOS. Error messages below:
llvm/lib/Support/Threading.cpp:190:3: error: use of undeclared
identifier 'sysctlbyname'
sysctlbyname("hw.physicalcpu", &count, &len, NULL, 0);
^
llvm/lib/Support/Threading.cpp:193:13: error: use of undeclared
identifier 'CTL_HW'
nm[0] = CTL_HW;
^
llvm/lib/Support/Threading.cpp:194:13: error: use of undeclared identifier 'HW_AVAILCPU'
nm[1] = HW_AVAILCPU;
^
llvm/lib/Support/Threading.cpp:195:5: error: use of undeclared identifier 'sysctl'
sysctl(nm, 2, &count, &len, NULL, 0);
^
|
|
This change is focussed on simplifying `Support/Host.h` to only do
target detection. In this case, this function is close in usage to
existing functions in `Support/Threading.h`, so I moved it into there.
The function is also renamed to `llvm::get_physical_cores()` to match
the style of threading's functions.
Differential Revision: https://reviews.llvm.org/D137836
|
|
Windows computeHostNumPhysicalCores is defined by Threading.cpp. Leave it
unchanged.
|
|
computeHostNumPhysicalCores/computeHostNumHardwareThreads"
This reverts commit 9969ceb36b440eaafa17c486f29a69c7a7da3b3b.
On Windows:
lld-link: error: undefined symbol: int __cdecl computeHostNumPhysicalCores(void)
>>> referenced by LLVMSupport.lib(Support.Host.obj):(int __cdecl llvm::sys::getHostNumPhysicalCores(void))
|
|
|
|
cortex-x2.
I noticed these were missing, so this adds Host identifiers for
cortex-a55, cortex-a510, cortex-a710 and cortex-x2, taken from their
respective TRMs.
Differential Revision: https://reviews.llvm.org/D138497
|
|
This patch adds support for detecting all current Zen/Zen3+ submodels
Based off a mixture of https://github.com/torvalds/linux/blob/master/drivers/hwmon/k10temp.c#L436 and InstLatx64 https://github.com/InstLatx64/InstLatx64/tree/master/AuthenticAMD CPUID dumps and confirmed by @GGanesh
Differential Revision: https://reviews.llvm.org/D137695
|
|
Cortex-X3 is an Armv9-A AArch64 CPU.
This patch introduces support for Cortex-X3.
Technical Reference Manual: https://developer.arm.com/documentation/101593/latest
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D136589
|
|
Reviewed By: skan, pengfei, MaskRay
Differential Revision: https://reviews.llvm.org/D137153
|
|
|
|
Reviewed By: pengfei, skan, MaskRay
Differential Revision: https://reviews.llvm.org/D135937
|
|
Cortex-A715 is an Armv9-A AArch64 CPU.
This patch introduces support for Cortex-A715.
Technical Reference Manual: https://developer.arm.com/documentation/101590/latest.
Reviewed By: vhscampos
Differential Revision: https://reviews.llvm.org/D136957
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D135930
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D135938
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D135932
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Initial authored by Liu Chen (@LiuChen3)
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D135951
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D135933
|
|
Fixes #58545
|
|
Differential Revision: https://reviews.llvm.org/D135941
|
|
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D136040
|
|
Adds support for the Neoverse V2 CPU to the AArch64 backend.
Differential Revision: https://reviews.llvm.org/D134352
|
|
non-Android Linux with no custom implementation
Make the sched_getaffinity based implementation available to all architectures
(except s390x/x86 which have a custom implementation). The `CPU_ALLOC(2048)`
code supports all `CONFIG_NR_CPUS` values in Linux kernel `arch/*/configs/`.
The function is mainly used by in-process ThinLTO to decide the default number
of threads. Returning -1 will use just one thread.
Android is excluded because of the higher API level requirement:
`sched_getaffinity; # introduced-arm=12 introduced-arm64=21 introduced-x86=12 introduced-x86_64=21`
|
|
As mentioned on D128934 - we weren't including the CPUID bit handling for the RDPRU instruction
AMD's APMv3 (24594) lists it as CPUID Fn8000_0008_EBX Bit#4
|
|
With C++17 there is no Clang pedantic warning or MSVC C5051.
|
|
While working on D118450 <https://reviews.llvm.org/D118450>, I noticed that
`sys::getHostCPUName` lacks SPARC support.
This patch implements it. The code is taken from/inspired by GCC's
`gcc/config/sparc/driver-sparc.cc`. There's one caveat: since LLVM, unlike
GCC, doesn't support the SPARC-M7, -S7, and -M8 CPUs, I map all those to
the latest supported one (UltraSparc T4/`niagara4`).
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu` by
running `savcov --version` on
- Netra SPARC S7-2 (SPARC-S7, Solaris 11.4)
- SPARC T5-2 (SPARC T5, Solaris 11.4)
- SPARC Enterprise T5220 (UltraSPARC T2, Solaris 11.3)
- SPARC T5 (UltraSPARC T5, Debian sid)
- SPARC T3 (UltraSPARC T3, Debian sid)
- SPARC Enterprise T5220 (Debian sid)
Differential Revision: https://reviews.llvm.org/D130272
|
|
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D123978
|
|
Map the IMPLEMENTOR ID 0xc0 (Ampere Computing) and CPU ID 0xac3
(Ampere1) to ampere1.
Differential Revision: https://reviews.llvm.org/D117111
|
|
The recently announced IBM z16 processor implements the architecture
already supported as "arch14" in LLVM. This patch adds support for
"z16" as an alternate architecture name for arch14.
|
|
This reverts commit fc3cdd0b295a04c38f01b391ae414553963e33b9.
The issue was imports being scoped to specific architectures for Apple
platforms.
|
|
This reverts commit fcca10c69aaab539962d10fcc59a5f074b73b0de.
|
|
This improves the getHostCPUName check for Apple M1 CPUs, which
previously would always be considered cyclone instead. This also enables
`-march=native` support when building on M1 CPUs which would previously
fail. This isn't as sophisticated as the X86 CPU feature checking which
consults the CPU via getHostCPUFeatures, but this is still better than
before. This CPU selection could also be invalid if this was run on an
iOS device instead, ideally we can improve those cases as they come up.
Differential Revision: https://reviews.llvm.org/D119788
|
|
|
|
While `-march=` is correctly detected as `znver3` for the cpu,
apparently the model check is incorrect:
```
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5950X 16-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 0
Frequency boost: disabled
CPU max MHz: 6017.8462
CPU min MHz: 2200.0000
BogoMIPS: 8050.07
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse
3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_p
state ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbn
oinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 8 MiB (16 instances)
L3: 64 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling
Srbds: Not affected
Tsx async abort: Not affected
```
Model is 33 (0x21), while the code was expecting it to be `0x00 .. 0x1F`.
https://github.com/torvalds/linux/blob/v5.17-rc8/drivers/hwmon/k10temp.c#L432-L453 agrees.
I'm not sure if other ranges listed here should also be accepted.
I noticed this while implementing CPU model detection
for halide (https://github.com/halide/Halide/pull/6648)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D121708
|
|
This also includes a few cleanup from Support.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121331
|
|
Add detecton for lse, sve and sve2 on linux
Differential Revision: https://reviews.llvm.org/D119435
|
|
This patch upstreams support for the Arm-v8 Cortex-X1C processor for AArch64 and
ARM.
For more information, see:
- https://community.arm.com/arm-community-blogs/b/announcements/posts/arm-cortex-x1c
- https://developer.arm.com/documentation/101968/0002/Functional-description/Technical-overview/Components
The following people contributed to this patch:
- Simon Tatham
- Ties Stuij
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D117202
|
|
Map Main ID part number 0xd40 to neoverse-v1, as described in the
Neoverse-V1 Technical Reference Manual:
https://developer.arm.com/documentation/101427/0101/Register-descriptions/AArch64-system-registers/MIDR-EL1--Main-ID-Register--EL1
Differential Revision: https://reviews.llvm.org/D117207
|
|
Identified with modernize-use-nullptr.
|
|
The RISCV target doesn't define a "generic" cpu, only "generic-rv32" and
"generic-rv64". Define sys::getHostCPUName for RISC-V that returns the
correct cpu for the host.
Reviewed By: craig.topper, MaskRay
Differential Revision: https://reviews.llvm.org/D105274
|
|
d8faf03807ac implemented general-regs-only for X86 by disabling all features
with vector instructions. But the CRC32 instruction in SSE4.2 ISA, which uses
only GPRs, also becomes unavailable. This patch adds a CRC32 feature for this
instruction and allows it to be used with general-regs-only.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D105462
|
|
1. Enable FP16 type support and basic declarations used by following patches.
2. Enable new instructions VMOVW and VMOVSH.
Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D105263
|
|
Differential Revision: https://reviews.llvm.org/D107245
|
|
This patch adds support for the next-generation arch14
CPU architecture to the SystemZ backend.
This includes:
- Basic support for the new processor and its features.
- Detection of arch14 as host processor.
- Assembler/disassembler support for new instructions.
- New LLVM intrinsics for certain new instructions.
- Support for low-level builtins mapped to new LLVM intrinsics.
- New high-level intrinsics in vecintrin.h.
- Indicate support by defining __VEC__ == 10304.
Note: No currently available Z system supports the arch14
architecture. Once new systems become available, the
official system name will be added as supported -march name.
|
|
Code in getCPUNameFromS390Model currently assumes that the
numerical value of the model number always increases with
future hardware. While this has happened to be the case
with the last few machines, it is not guaranteed -- that
assumption was violated with (much) older machines, and
it can be violated again with future machines.
Fix by explicitly listing model numbers for all supported
machine models.
|
|
Fix a bug that `computeHostNumPhysicalCores` is fallback to default
unknown when building for Apple Silicon macs.
rdar://80533675
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106012
|