Age | Commit message (Collapse) | Author | Files | Lines |
|
Remove the Native Client support now that it has finally reached end of life.
|
|
Adds sve-sm4 to reference FEAT_SVE_SM4 without specifically enabling
SVE2.
|
|
This patch adds support for -mcpu=gb10 (NVIDIA GB10). This is a
big.LITTLE cluster of Cortex-X925 and Cortex-A725 cores. The appropriate
MIDR numbers are added to detect them in -mcpu=native.
We did not add an -mcpu=cortex-x925.cortex-a725 option because GB10 does
include the crypto instructions which we want on by default, and the
current convention is to not enable such extensions for Arm Cortex cores
in -mcpu where they are optional in the IP.
Relevant GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687005.html
|
|
instructions. (#145696)
Adds sve-sha3 to reference FEAT_SVE_SHA3 without specifically enabling
SVE2. The SVE2 requirement for AES, SHA3 and Bitperm is replaced with
SVE for non-streaming function.
|
|
features (#142236)
The `targetFeatureToExtension` function used by
reconstructFromParsedFeatures only found positive `+FEATURE` strings,
but not negative `-FEATURE` strings. Extend the function to handle both
to fix `reconstructFromParsedFeatures`.
|
|
|
|
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.
For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320
---------
Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
|
|
`+simd` and `+nosimd` are used to enable or disable NEON Instructions
when compiling for ARM Targets. However, up until now, using these
has not been possible. To enable this, these options are mapped to the
relevant LLVM backend option (`+neon` and `-neon`) so it can be both
enabled and disabled successfully by the user.
Tests have been added to ensure this behaviour is maintained in the
future, along with updates to existing tests as behaviour has now changed
relating to the use of `+simd` and `+nosimd`.
As `simd` has been mapped within the ARMTargetParser.def, support for
this extension is also added for the `--print-support-extensions` command
when the target is AArch32. This will print the `simd` option, along with the
description that relates to the Neon feature. This previously was not
possible as `simd` did not have a related Feature or Negative Feature.
To make this functional as intended, MVE and MVE.FP now rely on their
own Enum identifier, rather than `AEK_SIMD`. While SIMD does refer to
both Neon and Helium technologies, in terms of command line options,
SIMD relates to Neon. Helium relates to MVE and MVE.FP. The Enum
now reflects this too.
|
|
This patch adds support for the NVIDIA Olympus core.
This does not add any special tuning decisions, and those may come
later.
|
|
Enable optional ISA extensions on Grace when mcpu=grace
is used: sve2-sm4, sve2-aes, sve2-sha3.
Grace is no longer an alias, but a separate CPU definition.
|
|
apple-a18 is an alias of apple-m4.
apple-s6/s7/s8 are aliases of apple-a13.
apple-s9/s10 are aliases of apple-a16.
As with some other aliases today, this reflects identical ISA feature
support, but not necessarily identical microarchitectures and
performance characteristics.
|
|
The patch itself is mainly the expected unittest boilerplate.
This adds tests for the aliases we have today.
We could alternatively test these via the driver with additional
run-lines in print-enable-extensions tests, and eventually should
consider that instead.
|
|
These features FEAT_FAMINMAX, FEAT_LUT and FEAT_FP8 depends on
FEAT_NEON.
Update dependency from FEAT_FP8DOT4 and FEAT_FP8DOT2. Now depends
indirectly on FEAT_NEON through FEAT_FP8
|
|
Previously, when selecting a Single Precision FPU, LLVM would ensure all
elements of the Candidate FPU matched the InputFPU that was given.
However, for cases such as Cortex-R52, there are FPU options where not
all fields match exactly, for example NEON Support or Restrictions on
the Registers available.
This change ensures that LLVM can select the FPU correctly, removing the
requirement for Neon Support and Restrictions for the Candidate FPU to
be the same as the InputFPU.
|
|
The 2024-12 ISA spec release[1] add these features:
FEAT_SME_MOP4(sme-mop4) to enable SME Quarter-tile outer product
instructions
and
FEAT_SME_TMOP(sme-tmop) to enable SME Structured sparsity outer product
instructions
to allow these instructions to be available outside Armv9.6/sme2p2
[1]
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
|
|
The 20204-12 ISA update release adds a new feature: FEAT_SSVE_BitPerm,
which allows the sve-bitperm instructions to run in streaming mode.
It also removes the requirement of FEAT_SVE2 for FEAT_SVE_BitPerm. The
sve2-bitperm feature is now an alias for sve-bitperm and sve2.
A new feature flag sve-bitperm is added to reflect the change that the
instructions under FEAT_SVE_BitPerm are supported if:
on non streaming mode with FEAT_SVE2 and FEAT_SVE_BitPerm or
in streaming mode with FEAT_SME and FEAT_SSVE_BitPerm
|
|
This patch simplifies feature dependencies of FP8 features and also adds
new tests to check these.
|
|
This patch adds initial support for FUJITSU-MONAKA CPU (-mcpu=fujitsu-monaka).
The scheduling model will be corrected in the future.
|
|
This core was originally AArch64-only, but the r1p0 revision added
optional support for AArch32 at EL0.
TRM: https://developer.arm.com/documentation/101604/0103
|
|
This patch essentially re-lands
https://github.com/llvm/llvm-project/pull/114293 with the following
fixups
- `nosve2-aes` should disable the backend feature `FeatureSVEAES` such
that the set of existing instructions that this removes is unchanged.
- FMV dependencies now use the autogenerated `ExtensionDepencies`
structure (since https://github.com/llvm/llvm-project/pull/113281) so we
do not require the change to `AArch64FMV.td`.
|
|
(#115539)
…293)"
This reverts commit da9499ebfb323602c42aeb674571fe89cec20ca6.
|
|
This patch introduces the amended feature flag for
[FEAT_SVE_AES](https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-0-architecture-extension?lang=en#md457-the-armv90-architecture-extension__feat_FEAT_SVE_AES),
'**sve-aes**'. The existing flag associated with this feature,
'sve2-aes' must be retained as an alias of 'sve-aes' and 'sve2' for
backwards compatibility.
The
[ACLE](https://github.com/ARM-software/acle/blob/main/main/acle.md#aes-extension)
documents `__ARM_FEATURE_SVE2_AES`, which was previously defined to 1
when
> there is hardware support for the SVE2 AES (FEAT_SVE_AES) instructions
and if the associated ACLE intrinsics are available.
The front-end has been amended such that it is compatible with +sve2-aes
and +sve2+sve-aes.
|
|
Add support for the following Armv9.6-A architecture extensions:
* FEAT_PoPS - Point of Physical Storage
as documented here:
https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension
Co-authored-by: Alfie Richards <alfie.richards@arm.com>
|
|
Co-authored-by: Jonathan Thackray <jonathan.thackray@arm.com>
Co-authored-by: SpencerAbson <Spencer.Abson@arm.com>
|
|
extensions (#112341)
Add support for the following Armv9.6-A memory systems extensions:
FEAT_LSUI - Unprivileged Load Store
FEAT_OCCMO - Outer Cacheable Cache Maintenance Operation
FEAT_PCDPHINT - Producer-Consumer Data Placement Hints
FEAT_SRMASK - Bitwise System Register Write Masks
as documented here:
https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension
Co-authored-by: Jonathan Thackray <jonathan.thackray@arm.com>
---------
Co-authored-by: Jonathan Thackray <jonathan.thackray@arm.com>
|
|
This patch implements new features introduced in 2024 release of ARM ISA
and creates predicates, which will be used by new instructions.
Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Spencer Abson spencer.abson@arm.com
|
|
STAR-MC1 is an Armv8m CPU.
Technical specifications available at:
https://www.armchina.com/download/Documents/Application-Notes/Technical-Reference-Manual?infoId=160
|
|
This introduces the Armv9.6-A architecture version, including the
relevant command-line option for -march.
More details about the Armv9.6-A architecture version can be found at:
* https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2024
* https://developer.arm.com/documentation/ddi0602/2024-09/
|
|
This is a partial revert of c66e1d6f3429. Even though that
allowed us to declare v9.2-a support without picking up SVE2
in both the backend and the driver, the frontend itself still
enabled SVE via the arch version's default extensions.
Avoid that by reverting back to v8.7-a while we look into
longer-term solutions.
|
|
Failure with -Werror buildbot caused by #104587
|
|
These are annoying to update, and are redundant since the tests in
clang/test/Driver/print-enabled-extensions/ were added.
|
|
This adds a check that all ExtensionWithMArch which are marked as
implied features for an architecture are also present in the list of
default features. It doesn't make sense to have something mandatory but
not on by default.
There were a number of existing cases that violated this rule, and some
changes to which features are mandatory (indicated by the Implies
field).
This resulted in a bug where if a feature was marked as `Implies` but
was not added to `DefaultExt`, then for `-march=base_arch+nofeat` the
Driver would consider `feat` to have never been added and therefore
would do nothing to disable it (no `-target-feature -feat` would be
added, but the backend would enable the feature by default because of
`Implies`). See
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c.
Note that the processor definitions do not respect the architecture
DefaultExts. These apply only when specifying `-march=<some architecture
version>`. So when a feature is moved from `Implies` to `DefaultExts` on
the Architecture definition, the feature needs to be added to all
processor definitions (that are based on that architecture) in order to
preserve the existing behaviour. I have checked the TRMs for many cases
(see specific commit messages) but in other cases I have just kept the
current behaviour and not tried to fix it.
|
|
Implement FEAT_SME_B16B16 to enable ZA-targeting non-widening SME
BFloat16 instructions. Remove the now redundant FEAT_B16B16 which has
been replaced by FEAT_SVE_B16B16 and FEAT_SME_B16B16 (this commit), see
https://github.com/llvm/llvm-project/pull/101480/ for the details and
reasoning of this change to LLVM.
FEAT_SME_B16B16 is documented under the latest Armv9.4 feature
documentation:
https://developer.arm.com/documentation/109697/0100/Feature-descriptions/The-Armv9-4-architecture-extensio
- Changes to Clang AArch64 frontend
- Change target guard of SME2 ZA-targeting non-widening BFloat16
intrinsics to 'sme-b16b16'
- Changes to LLVM AArch64 backend
- llvm/lib/Target/AArch64/AArch64Features.td
- Create FeatureSMEB16B16, which implies FeatureSME2 and
FeatureSVEB16B16
- Remove FeatureB16B16
- Fix description of FeatureSVEB16B16
- llvm/lib/Target/AArch64/AArch64InstrInfo.td
- Create HasSMEB16B16 predicate
- llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
- Change predictication of SME2 ZA-targeting non-widening BFloat16
instructions to new HasSMEB16B16
- llvm/lib/Target/AArch64/AArch64.td
- Add HasSMEB16B16 to SME2Unsupported (FEAT_SME_B16B16 implies
FEAT_SME2)
- llvm/lib/AArch64/AsmParser/AArch64AsmParser.cpp
- Remove flag 'b16b16' mapping to removed FeatureB16B16
- Add flag 'sme-b16b16' mapping to new FeatureSMEB16B16
- Changes to LLVM unit tests
- llvm/unittests/TargetParser/TargetParserTest.cpp
- Add new sme-b16b16 flag to existing target parser tests
- Add tests for the sme-b16b16 dependencies:
- 'sme-b16b16' should enable 'sme2', 'sve-b16b16'. - Remove 'b16b16'
from bf16 dependency test
- Added MC tests
- llvm/test/MC/AArch64/SME2p1
- To ensure that ZA-targeting multi-vector non-widening BFloat16
instructions are enabled by +sme-b16b16, and that this feature is
removed by +nosme-b61b6.
- Modidified tests
- All CodeGen, Semantic, and MC tests that are effected by the removal
of 'b16b16', have been modified to supply and/or expect 'sme-b16b16'
where appropriate.
|
|
(#101480)
This patch adds FeatureSVEB16B16 to the AArch64 backend in order to
represent the new behavior of FEAT_SVE_B16B16 (as described in the
latest [Armv9.4 extensions
documentation](https://developer.arm.com/documentation/109697/0100/Feature-descriptions/The-Armv9-4-architecture-extension?lang=en#md461-the-armv94-architecture-extension__FEAT_SVE_B16B16))
as well as a 'sve-b16b16' flag to enable it.
The predication of non-widening SVE BFloat16 instructions has changed to
require this feature, instead of the previously required and
soon-to-be-removed FeatureB16B16 which is enabled by the 'b16b16' flag.
Therefore, this change weakens the 'b16b16' flag in favour of
'sve-b16b16'. Existing tests that are effected by this have been
modified to use and/or expect 'sve-b16b16', and new tests have been
added to verify the behavior and implementation of 'sve-b16b16'.
This patch is in response to the response to the following changes.
The architecture features previously enabled by FEAT_SVE_B16B16 have
been relaxed such that it now implements:
- With FEAT_SVE2 : SVE non-widening BFloat16 instructions in
Non-streaming SVE mode
- With FEAT_SME2: SVE non-widening BFloat16 instructions when the
PE is in Streaming SVE mode and SME
Z-targeting multi-vector non-widening BFloat16 instructions.
- **It no longer implements** SME ZA-targeting non-widening
BFloat16 instructions.
The SME ZA-targeting non-widening BFloat16 instructions are implemented
by the new FEAT_SME_B16B16, **this patch does not change how this
architecture feature is enabled** ('+b16b16+sme2'). Only those that are
implemented by FEAT_SVE_B16B16 have been changed to require 'sve-b16b16'
instead of 'b16b16'.
New flags must be created to represent FEAT_SVE_B16B16 and
FEAT_SME_B16B16:
- 'sve-b16b16' enables the updated FEAT_SVE_B16B16 (described
here)
- 'sme-b16b16' will enable the new FEAT_SME_B16B16
- **This patch includes 'sve-b16b16' only**
A future patch will add 'sme-b16b16', SME ZA-targeting non-widening
BFloat16 instructions would then be guarded by '+sme-b16b16+sme2', and
'b16b16' can be removed.
|
|
We added FPAC recently in d7e8a7487cd7 to allow ptrauth codegen to rely
on the cpu auth failure checks rather than emitting its own auth failure
check/brk sequence.
Add it to the Apple processors that do have it: A15, A16, A17, M4.
While there, tweak the description to refer to Armv8.3-A rather than
v8.3-A, matching the other features.
|
|
But since SVE and friends have been added to the default extensions
list, and every CPU was opted into those extensions by default, we
couldn't correctly announce its architecutral version to the backend.
Additionally, we FEAT_MEC from llvm's "required" list for v9.0 to the
optional list for v9.2, as the spec considers it optional, and M4 does
not implement it. Similarly, fixes up several bugs w.r.t. FEAT_RME.
As a drive-by, I noticed that saphira did not have an
AArch64CPUTestParams entry, and thus added one.
|
|
AFAICT, the only use of the field was for the ARM side of this shared
struct.
|
|
For both overloads, we were printing the bit-pattern for ExpectedFlags
twice. While we're here, also add a convenience line that highlights the
difference between the two sets.
|
|
--print-supported-extensions (#97829)
For AArch64, we have existing tests for `--print-enabled-extensions` for
each architecture. However:
- These are added to the end of the existing tests which check for
`"-target-feature"`, which complicates them slightly.
- They do not test the descriptions printed next to each feature.
- Part of the output was tested separately in `TargetParserTest`.
- We did not have _any_ tests of this output for CPUs (only for
architectures).
Similarly, the tests for `--print-supported-extensions` do not give
complete coverage of either the full list of features or the
descriptions.
In my opinion we should be testing the full output, as this is what the
user sees. Descriptions and formatting can contain errors and be
accidentally broken.
|
|
|
|
(#97367)
There were a couple of cases where this field was just plain wrong
because we weren't actually testing against it. Instead, drop the
`CPUAttr` field on AArch64 tests.
|
|
(#95805) (#96795)
This introduces the new `--print-enabled-extensions` command line option
to AArch64, which prints the list of extensions that are enabled for the
target specified by the combination of `--target`/`-march`/`-mcpu`
values.
The goal of the this option is both to enable the manual inspection of
the enabled extensions by users and to enhance the testability of
architecture versions and CPU targets implemented in the compiler.
As part of this change, a new field for `FEAT_*` architecture feature
names was added to the TableGen entries. The output of the existing
`--print-supported-extensions` option was updated accordingly to show
these in a separate column.
|
|
Reverts llvm/llvm-project#95805 due to test failures caught by the
buildbots.
|
|
This introduces the new `--print-enabled-extensions` command line option
to AArch64, which prints the list of extensions that are enabled for the
target specified by the combination of `--target`/`-march`/`-mcpu`
values.
The goal of the this option is both to enable the manual inspection of
the enabled extensions by users and to enhance the testability of
architecture versions and CPU targets implemented in the compiler.
As part of this change, a new field for `FEAT_*` architecture feature
names was added to the TableGen entries. The output of the existing
`--print-supported-extensions` option was updated accordingly to show
these in a separate column.
|
|
... so move it out of the `implied_features` list, and into the
`DefaultExts` list.
|
|
FMV extensions are really just mappings from FMV feature names to lists
of backend features for codegen. Split them out into their own separate
file.
|
|
This is a follow up to #92037, which moved the architecture information.
Generate the AArch64TargetParser CPUInfo from tablegen Processor defs using a
new tablegen emitter. Some basic error checking is added in the emitter to
ensure that duplicate features are not added to the Processor defs.
The generic CPU becomes an entry in tablegen.
Some CPU features which were present in the CPUInfo but absent from the tablegen
defs have been added to tablegen. FeatureCrypto is replaced with FeatureSHA2 and
FeatureAES. This changes a few of the tests.
|
|
(#95579)
|
|
|
|
Cortex-A725 and Cortex-X925 are Armv9.2 AArch64 CPUs.
Technical Reference Manual for Cortex-A725:
https://developer.arm.com/documentation/107652/latest
Technical Reference Manual for Cortex-X925:
https://developer.arm.com/documentation/102807/latest
|