Age | Commit message | Author | Files | Lines |
|
This patch adds direct code-gen support for a faster MOD intrinsic for
REAL types. Flang has maintained, and continues to maintain, a
high-precision implementation of the MOD intrinsic as part of the Fortran
runtime. With the -ffast-real-mod flag, users can opt out of the runtime
call and instead trigger code-gen that evaluates the MOD formula as
specified by ISO Fortran inline. This produces faster code, at the
expense of potentially risking bit cancellation.
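As a rough illustration of the trade-off described above, the sketch below (in C++, not Fortran or generated IR) contrasts a direct evaluation of the ISO Fortran formula MOD(A, P) = A - INT(A / P) * P with `std::fmod`, which stands in here for a high-precision runtime routine. The function names are hypothetical and this is not the code Flang emits.

```cpp
#include <cmath>

// Direct evaluation of MOD(a, p) = a - INT(a / p) * p.
// Cheap, but the product trunc(a / p) * p is rounded, so the final
// subtraction can lose bits when |a / p| is large.
double fast_real_mod(double a, double p) {
    return a - std::trunc(a / p) * p;
}

// Stand-in for a high-precision implementation: std::fmod computes the
// remainder exactly, with the sign of the dividend.
double precise_real_mod(double a, double p) {
    return std::fmod(a, p);
}
```

For small operands both functions agree; with a very large quotient the formula-based version can drift from the exact remainder, which is the risk the flag asks users to accept.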
|
|
`lookupRuntimeDefinition` assumed that a process would handle only one
TU. This is not true for unit tests, for instance. Multiple snippets of
code get parsed, and their ASTs are unloaded each time.
Since the cache relies on pointers as keys, if the same address happens
to be reused between runs, the cache would return a stale pointer,
potentially causing a segmentation fault. This is not that unlikely if
the snippets are similar, which would trigger similar allocation
patterns.
CPP-4889
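The hazard described above can be sketched with a small pointer-keyed cache. This is a minimal sketch under stated assumptions: `Decl` and `DefinitionCache` are hypothetical stand-ins, not the types used by `lookupRuntimeDefinition`, and clearing the cache on AST teardown is only one possible remedy (the commit message does not say which fix was chosen).

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an AST node owned by one translation unit.
struct Decl {
    std::string name;
};

// A cache keyed by node address. Entries are only meaningful while the
// AST that owns the keys is alive: once it is destroyed, a new AST may
// reuse the same addresses for unrelated nodes.
class DefinitionCache {
public:
    const Decl* lookup(const Decl* key) const {
        auto it = entries_.find(key);
        return it == entries_.end() ? nullptr : it->second;
    }

    void remember(const Decl* key, const Decl* definition) {
        entries_[key] = definition;
    }

    // Assumed remedy: drop everything when the owning AST is unloaded,
    // so a reused address can never hit a stale entry.
    void clear() { entries_.clear(); }

    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<const Decl*, const Decl*> entries_;
};
```

A test harness that parses several snippets in one process would call `clear()` between snippets in this sketch.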
|
|
This change moves the getUsualDeleteParams function into the
FunctionDecl class so that it can be shared between LLVM IR and CIR
codegen.
|
|
This patch updates the frontend to support version 1.2 of root
signatures. It adds parsing, metadata generation, and a few tests.
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
|
|
Update the `operator%` overload that accepts `CharUnits` to return
`CharUnits` to match the other `operator%`. This is more logical than
returning an `int64` and cleans up users that want to continue to do
math with the result.
Many users of this were explicitly comparing against 0. I considered
updating these to compare against `CharUnits::Zero` or even introducing
an `explicit operator bool()`, but they all feel clearer if we update
them to use the existing `isMultipleOf()` function instead.
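The API shape discussed above can be sketched with a miniature quantity wrapper. `Units` is a hypothetical stand-in written for this example and is not `clang::CharUnits`; it only shows why an `operator%` that returns the wrapper type composes better with further arithmetic, and why `isMultipleOf()` reads more clearly than comparing a remainder against zero.

```cpp
#include <cstdint>

// Hypothetical miniature of a strongly typed size/alignment quantity.
class Units {
    std::int64_t quantity_ = 0;
    explicit Units(std::int64_t q) : quantity_(q) {}

public:
    Units() = default;

    static Units fromQuantity(std::int64_t q) { return Units(q); }

    std::int64_t getQuantity() const { return quantity_; }
    bool isZero() const { return quantity_ == 0; }

    Units operator+(const Units& other) const {
        return Units(quantity_ + other.quantity_);
    }

    // Returns the wrapper type, so the result can keep taking part in
    // typed arithmetic instead of decaying to a raw integer.
    Units operator%(const Units& other) const {
        return Units(quantity_ % other.quantity_);
    }

    // Clearer than writing (a % b) == 0 at every call site.
    bool isMultipleOf(const Units& n) const { return (*this % n).isZero(); }
};
```

With this shape, `(offset % align) + padding` stays in `Units` throughout, and alignment checks become `offset.isMultipleOf(align)`.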
|
|
Update cir::CreateRealOp to make it visible on scalars
Issue #160568
|
|
(#161431)
This mirrors incubator changes from https://github.com/llvm/clangir/pull/1922
|
|
HLSLResource.h added by #161254 builds in the context of a .cpp file
(e.g. CGHLSLRuntime.cpp) but not when doing a header compilation, e.g.:
```
clang/include/clang/AST/Attrs.inc:12:45: error: unknown type name 'raw_ostream'; did you mean 'clang::raw_ostream'?
12 | static inline void DelimitAttributeArgument(raw_ostream& OS, bool& IsFirst) {
```
|
|
(#159689)
This patch accelerates complex division by passing
`-complex-range=basic` to the frontend when the `-ffast-math` option is
specified. This behavior is the same as `-fcomplex-arithmetic=basic`. A
warning is issued if a different value is specified for
`-fcomplex-arithmetic=`. The warning conditions will be unified with
clang.
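For context, the "basic" range corresponds to the plain algebraic formula for complex division, with no extra scaling or special handling of NaN and infinity. The sketch below writes that formula out in C++; `divide_basic` is a hypothetical name used only for illustration and is not the code either compiler generates.

```cpp
#include <complex>

// Algebraic complex division:
//   (a + bi) / (c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2)
// No range reduction is performed, so c*c + d*d may overflow or
// underflow for extreme inputs. That is the precision/speed trade-off.
std::complex<double> divide_basic(std::complex<double> x,
                                  std::complex<double> y) {
    const double a = x.real(), b = x.imag();
    const double c = y.real(), d = y.imag();
    const double denom = c * c + d * d;
    return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}
```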
|
|
attributes (#161254)
Add new `ResourceBindingAttrs` struct that holds resource binding attributes `HLSLResourceBindingAttr` and `HLSLVkBindingAttr` and provides helper methods to simplify dealing with resource bindings. This code is placed in the AST library to be shared between Sema and CodeGen.
This change has been made in preparation for a third binding attribute coming soon to represent `[[vk::counter_binding()]]`. This new attribute and more helper member functions will be added to `ResourceBindingAttrs` and will be used in both Sema and CodeGen to implement resource counter initialization.
|
|
This adds support for handling global variables with non-trivial
constructors. The constructor call is emitted in CIR as a 'ctor' region
associated with the global definition. This form of global definition
cannot be lowered to LLVM IR yet.
A later change will add support in LoweringPrepare to move the ctor code
into a __cxx_global_var_init() function and add that function to the
list of global ctors, but for now we must stop at the initial CIR
generation.
|
|
|
|
Upstream the RTTI builder with helpers and use them in the VTable
Definitions
Issue https://github.com/llvm/llvm-project/issues/154992
|
|
Update generated docs for legacy attributes:
* no_sanitize_(address|thread|memory)
* no_address_safety_analysis
Those are older forms of the no_sanitize("list", "of", "sanitizers")
attribute. They were previously defined as various spellings of the same
attribute, which made the auto-generated documentation confusing.
Fix this by explicitly making them three different attributes. This
also allows us to slightly simplify the delegation to the new no_sanitize
form, as we can instead rely on auto-generated code to check that
TSan and MSan can't be disabled for globals.
**HTML docs before:**
https://github.com/user-attachments/assets/407b5fc1-799c-4882-8ff8-44a5ef3cf4f1
**HTML docs after:**
https://github.com/user-attachments/assets/236ca93f-25f8-4d58-95ac-ede95ce18d01
---------
Co-authored-by: Erich Keane <ekeane@nvidia.com>
|
|
This PR virtualizes module cache pruning via the new `ModuleCache`
interface. Currently this is an NFC, but I left a FIXME in
`InProcessModuleCache` to make this more efficient for the dependency
scanner.
|
|
Define the __dmr2048 type to represent the DMR pair introduced by the
Dense Math Facility on PowerPC, and add three Clang builtins
corresponding to DMF cryptography:
__builtin_mma_dmsha2hash
__builtin_mma_dmsha3hash
__builtin_mma_dmxxshapad
The __dmr2048 type is required for the dmsha3hash crypto builtin, and,
as with other PPC MMA and DMR types, its use is strongly restricted.
|
|
As a next step to generating pointer/array recipes, this patch generates
just the 'alloca' lines that are necessary. Copying pointers over to
restore the structure is held off to the next patch.
In the case of a pointer, we need to allocate the level 'below' it (if
we index into it), then copy the values into the pointers. In the case
of an array, we skip the alloca (since the array's alloca contains the
value).
After this, we'll need a patch that copies the pointers into place, and
finally one that does the initialization of these values.
|
|
This is an updated version of @vgvassilev's PR from last year here:
https://github.com/llvm/llvm-project/pull/94166
In short, it includes:
1. The fix for a blocking issue where `clang::Interpreter` (and thus
`clang-repl`) cannot resolve symbols defined in a PCH
2. A test to prove this is working
3. A new hidden flag for `clang-repl` so that `llvm-lit` can match the
host JIT triple between the PCH and `clang-repl`; previously, they may
differ in some cases
4. Everything based on the latest LLVM main
Shout out to @kylc for finding a logic issue which had us stumped for a
while (and securing the
[bounty](https://github.com/jank-lang/jank/issues/446)).
---------
Co-authored-by: Vassil Vassilev <v.g.vassilev@gmail.com>
Co-authored-by: Kyle Cesare <kcesare@gmail.com>
|
|
Fixed intrinsic VPDP[SS,SU,UU]D[,S]_128/256/512's argument types to match with the ISA.
Fixes part of #97271.
|
|
There is a tradition to use U.S. English spellings for APIs. For
example, it's uninitialized_fill and not uninitialised_fill,
specialization not specialisation, etcetera.
|
|
This adds an `operator<<` overload for `StreamingDiagnostic` that takes
an `APInt`/`APSInt` and formats it with default options, including
adding separators.
This is still an opt-in mechanism since all callers that want to use
this feature need to be changed from
```c++
Diag() << toString(MyInt, 10);
```
to
```c++
Diag() << MyInt;
```
This patch contains one example of a diagnostic making use of this.
|
|
Fixes #161070
---
This PR addresses the issue in `ext_decl_attrs_on_lambda` by using
`%0`=_attribute name_ and `%1`=_selector_, which prevents a null
`IdentifierInfo*`.
https://github.com/llvm/llvm-project/blob/48a6f2f85c8269d8326c185016801a4eb8d5dfd6/clang/lib/Parse/ParseExprCXX.cpp#L1299-L1302
https://github.com/llvm/llvm-project/blob/48a6f2f85c8269d8326c185016801a4eb8d5dfd6/clang/include/clang/Basic/DiagnosticParseKinds.td#L1143-L1145
https://github.com/llvm/llvm-project/blob/48a6f2f85c8269d8326c185016801a4eb8d5dfd6/clang/include/clang/Basic/DiagnosticParseKinds.td#L1149-L1152
|
|
This adds support for ctor and dtor regions in cir::GlobalOp. These
regions are used to capture the code that initializes and cleans up the
variable, keeping this initialization and cleanup code with the variable
definition.
This change only adds the CIR dialect support for these regions. Support
for generating the code in these regions from source and lowering these
to LLVM IR will be added in a later change, as will LoweringPrepare
support to move the code into the __cxx_global_var_init() function.
|
|
This adds basic operator delete handling in CIR. This does not yet
handle destroying delete or array delete, which will be added later. It
also does not insert non-null checks when not optimizing for size.
|
|
Adding a new builtin type for AMDGPU's image descriptor rsrc data type.
This is required for https://github.com/llvm/llvm-project/pull/140210
|
|
This makes the instantiation depth limit be checked whenever the code
synthesis context is pushed, not only when creating an
InstantiatingTemplate RAII object.
Also fix the note suggesting the user increases `-ftemplate-depth` so it
is printed even in a SFINAE context.
|
|
transformation directive and "looprange" clause (#139293)
This change implements the fuse directive, `#pragma omp fuse`, as specified in OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
|
|
Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
|
|
This simplifies those transforms a lot, removing a bunch of workarounds
which were introducing problems.
The transforms become independent of the template instantiator, so they
are moved to TreeTransform instead.
Fixes #131342
This PR was already reviewed and approved at
https://github.com/llvm/llvm-project/pull/160777, but I accidentally
merged that into another PR, instead of main.
|
|
Setting the prescriptiveness of the num_threads clause to 'strict' and
having a corresponding check (with message and severity clauses) does
not align well with how OpenMP should be handled for GPUs.
The num_threads expression may be an arbitrary integer expression which
is evaluated on the target, in accordance with the OpenMP spec. This
prevents the check from being done before launching the kernel,
especially considering that the num_threads clause is associated with
the parallel directive and that there may be multiple parallel
directives with different num_threads clauses in a single target region.
Acting on the result of the 'strict' check on the GPU would require
doing I/O on the GPU, which can introduce performance regressions.
Delaying any actions resulting from the 'strict' check and doing them on
the host after executing the target region involves additional data
copies and is not really semantically correct.
For now, the 'strict' modifier for the num_threads clause and its
associated message and severity clause are set to be unsupported on
GPUs. Targets other than GPUs still support the aforementioned features
in the context of an OpenMP target region.
|
|
This PR loads the path from `-fembed-offload-object=<path>` through the
VFS rather than going straight to the real file system. This matches the
behavior of other input files of the compiler. This technically changes
behavior in that `-fembed-offload-object=-` no longer loads the file
from stdin, but I don't think that was the intention of the original
code anyways.
|
|
This PR changes `llvm::FileCollector` to use the `llvm::vfs::FileSystem`
API for making file paths absolute instead of using
`llvm::sys::fs::make_absolute()` directly. This matches the behavior of
the compiler on most other input files.
|
|
Currently, RVV/SVE intrinsics are cached, but the corresponding type
construction is not. As a result, `ASTContext::getScalableVectorType`
can become a performance hotspot, since every query must run through a
long sequence of type checks and macro expansions.
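The idea of caching the type construction can be sketched as plain memoization. Everything below is an assumption made for illustration: `ElementKind`, `TypeId`, and `ScalableTypeCache` are hypothetical, the key is assumed to be (element kind, element count, field count), and the placeholder `slowLookup` merely stands in for the long chain of checks. It is not the actual `ASTContext` implementation.

```cpp
#include <map>
#include <tuple>

enum class ElementKind { Int8, Int16, Int32, Float32 };  // hypothetical
using TypeId = int;  // hypothetical handle for a constructed type

class ScalableTypeCache {
    std::map<std::tuple<ElementKind, unsigned, unsigned>, TypeId> cache_;
    unsigned slowLookups_ = 0;

    // Placeholder for the expensive path (a long sequence of checks).
    TypeId slowLookup(ElementKind kind, unsigned numElts, unsigned numFields) {
        ++slowLookups_;
        return static_cast<TypeId>(static_cast<unsigned>(kind) * 10000u +
                                   numElts * 10u + numFields);
    }

public:
    // Returns the cached result when the same query was seen before,
    // and only falls back to the slow path on a miss.
    TypeId get(ElementKind kind, unsigned numElts, unsigned numFields = 1) {
        const auto key = std::make_tuple(kind, numElts, numFields);
        auto it = cache_.find(key);
        if (it != cache_.end())
            return it->second;
        const TypeId result = slowLookup(kind, numElts, numFields);
        cache_.emplace(key, result);
        return result;
    }

    unsigned slowLookups() const { return slowLookups_; }
};
```

Repeated queries for the same key then cost one map lookup instead of a full pass through the checks.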
|
|
Enable the generation of no-loop kernels for Fortran OpenMP code. `target
teams distribute parallel do` directives can be promoted to no-loop kernels
if the user adds the -fopenmp-assume-teams-oversubscription and
-fopenmp-assume-threads-oversubscription flags.
If the OpenMP kernel contains reduction or num_teams clauses, it is not
promoted to no-loop mode.
The global OpenMP device RTL oversubscription flags no longer force
no-loop code generation for Fortran.
|
|
This flag enables the compiler to generate most of the debug
information in a separate file which can be useful for executable size
and link times. Clang already supports this flag.
I have tried to follow the logic of the clang implementation where
possible. Some functions were moved where they could be used by both
clang and flang. The `addOtherOptions` was renamed to `addDebugOptions`
to better reflect its purpose.
Clang also sets the `splitDebugFilename` field of the `DICompileUnit` in
the IR when this option is present. That part is currently missing from
this patch and will come in a follow-up PR.
|
|
Add support for `lifetimebound` attributes in the lifetime safety
analysis to track loans from function parameters to return values.
Implemented support for `lifetimebound` attributes on function
parameters.
This change replaces the single `AssignOriginFact` with two separate
operations: `OriginFlowFact` and `KillOriginFact`. The key difference is
in semantics:
* Old `AssignOriginFact`: Replaced the destination origin's loans
entirely with the source origin's loans.
* New `OriginFlowFact`: Can now optionally merge the source origin's
loans into the destination's existing loans.
* New `KillOriginFact`: Clears all loans from an origin.
For function calls with `lifetimebound` parameters, we first kill the
return value's origin, then use `OriginFlowFact` to accumulate loans
from multiple parameters into the return value's origin, which enables
tracking multiple lifetimebound arguments.
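The semantics of the three operations described above can be sketched with plain sets. This is a toy model under stated assumptions: loans are hypothetical integer IDs, `Origin` is just a loan set, and the free functions are named after the facts for readability only. The real analysis records these as dataflow facts and does not mutate sets directly.

```cpp
#include <set>
#include <vector>

using LoanSet = std::set<int>;  // loans as hypothetical integer IDs

struct Origin {
    LoanSet loans;
};

// Old AssignOriginFact: the destination's loans are replaced entirely.
void assignOrigin(Origin& dest, const Origin& src) { dest.loans = src.loans; }

// KillOriginFact: clears all loans from an origin.
void killOrigin(Origin& origin) { origin.loans.clear(); }

// OriginFlowFact: optionally keeps the destination's existing loans and
// merges the source's loans into them.
void flowOrigin(Origin& dest, const Origin& src, bool keepExisting) {
    if (!keepExisting)
        dest.loans.clear();
    dest.loans.insert(src.loans.begin(), src.loans.end());
}

// A call with several lifetimebound arguments: kill the return value's
// origin once, then accumulate the loans of every such argument.
void applyLifetimeboundCall(Origin& returnOrigin,
                            const std::vector<Origin>& lifetimeboundArgs) {
    killOrigin(returnOrigin);
    for (const Origin& arg : lifetimeboundArgs)
        flowOrigin(returnOrigin, arg, /*keepExisting=*/true);
}
```

With a single replace-style assignment, only the last argument's loans would survive; the kill-then-merge sequence keeps all of them.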
- Added a new `LifetimeAnnotations.h/cpp` to provide helper functions
for inspecting and inferring lifetime annotations
- Moved several functions from `CheckExprLifetime.cpp` to the new file
to make them reusable
The `lifetimebound` attribute is a key mechanism for expressing lifetime
dependencies between function parameters and return values. This change
enables the lifetime safety analysis to properly track these
dependencies, allowing it to detect more potential dangling reference
issues.
|
|
This change adds two new properties to each `result` object in the SARIF
log:
`partialFingerprints`: Contains the "issue hash" that the analyzer
already generates for each result, which can help identify a result
across runs even if surrounding code changes.
`hostedViewUri`: If running with `-analyzer-format=sarif-html`, this
property will now be emitted with the `file:` URL of the generated HTML
report for that result.
Along the way, I discovered an existing bug where the HTML diagnostic
consumer does not record the path to the generated report if another
compilation already created that report. This caused both the SARIF and
Plist consumers to be missing the link to the file in all but one of the
compilations in case of a warning in a header file. I added a new test
to ensure that the generated SARIF for each compilation contains the
property.
Finally, I made a few changes to the `normalize_sarif` processing in the
tests. I switched to `sed` to allow substitutions. The normalization now
removes directory components from `file:` URLs, replaces the `length`
property of the source file with a constant `-1`, and puts placeholders
in the values of the `version` properties rather than just deleting
them. The URL transformation in particular lets us verify that the right
filename is generated for each HTML report.
Fixes #158159
rdar://160410408
|
|
base-ptrs. (#155625)
These have been pulled out of the codegen PR #153683, to reduce the size
of that PR.
|
|
Addresses #99132.
|
|
(#160080)
Clarify the documentation in Builtins.def by quoting literal attribute
markers (e.g. 'n', 'r', 'U', 'N') to distinguish them from placeholders
such as N in C<N,...>. This avoids confusion and makes the attribute
docs clearer.
|
|
This patch teaches clang to accept gnu_printf, gnu_scanf, gnu_strftime and
gnu_strfmon. These attributes are aliases for printf, scanf, strftime and
strfmon.
Ref: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
Fixes: #16219
---------
Co-authored-by: Sirraide <aeternalmail@gmail.com>
|
|
This implements the easy parts of P2686R5,
i.e. allowing constexpr structured bindings of structs and arrays.
References to constexpr variables and support for tuples are left for a
future PR.
Until we implement the whole thing, the feature is not enabled as an
extension in older language modes.
Trying to use it as a tuple does produce errors but not meaningful ones.
We could add a better diagnostic if we fail to complete the
implementation before the end of the clang 22 cycle.
|
|
Adds HLSL function `NonUniformResourceIndex` to `hlsl_intrinsics.h`. The function calls a builtin `__builtin_hlsl_resource_nonuniformindex` which gets translated to LLVM intrinsic `llvm.{dx|spv}.resource_nonuniformindex`.
Closes #157923
|
|
After the previous implementation, I discovered that we were handling
arrays incorrectly for recipes and also didn't get the pointer allocations
right. This patch is the first of a few in a series that
attempts to make sure we get all pointers/arrays correct.
This patch is limited to just 'private' and destructors, which
simplifies the review significantly. Destructors are simply looped
through and called at each level.
The 'recipe-decl' is the 'least bounded' type (that is, the type of the
expression): in the case of `int[5] i; #pragma acc parallel
private(i[1])`, the type of the `recipe-decl` is `int`. This allows
us to do init/destruction at the element level.
This patch also adds infrastructure for the rest of the series of
private (for the init section), as well as extensive testing for
'private', with a lot of 'TODO' locations.
Future patches will fill these in, but at the moment, there is an NYI
warning for bounds, so a number of tests are updated to handle that.
|
|
On new targets like `gfx1250`, the buffer resource (V#) now uses this
format:
```
base (57-bit): resource[56:0]
num_records (45-bit): resource[101:57]
reserved (6-bit): resource[107:102]
stride (14-bit): resource[121:108]
```
This PR changes the type of `num_records` from `i32` to `i64` in both
builtin and intrinsic, and also adds the support for lowering the new
format.
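The bit layout listed above can be sketched as a pack/unpack pair over two 64-bit halves. This is a minimal sketch, not the compiler's lowering: `BufferResource` and the helper names are hypothetical, and bits [127:122], which the listing does not describe, are simply left at zero here. It also shows why a 45-bit `num_records` no longer fits in an `i32`.

```cpp
#include <cstdint>

// Hypothetical 128-bit descriptor split into two halves.
struct BufferResource {
    std::uint64_t lo;  // resource[63:0]
    std::uint64_t hi;  // resource[127:64]
};

constexpr std::uint64_t kBaseMask = (1ULL << 57) - 1;        // 57 bits
constexpr std::uint64_t kNumRecordsMask = (1ULL << 45) - 1;  // 45 bits
constexpr std::uint64_t kStrideMask = (1ULL << 14) - 1;      // 14 bits

BufferResource packBufferResource(std::uint64_t base,
                                  std::uint64_t numRecords,
                                  std::uint32_t stride) {
    base &= kBaseMask;
    numRecords &= kNumRecordsMask;
    const std::uint64_t s = stride & kStrideMask;

    BufferResource r;
    // base occupies [56:0]; the low 7 bits of num_records fill [63:57].
    r.lo = base | (numRecords << 57);
    // The remaining 38 bits of num_records fill [101:64] (hi[37:0]),
    // reserved [107:102] stays zero, stride fills [121:108] (hi[57:44]).
    r.hi = (numRecords >> 7) | (s << (108 - 64));
    return r;
}

std::uint64_t unpackBase(const BufferResource& r) { return r.lo & kBaseMask; }

std::uint64_t unpackNumRecords(const BufferResource& r) {
    return (r.lo >> 57) | ((r.hi & ((1ULL << 38) - 1)) << 7);
}

std::uint64_t unpackStride(const BufferResource& r) {
    return (r.hi >> (108 - 64)) & kStrideMask;
}
```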
Fixes SWDEV-554034.
---------
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
|
|
Explain why a type is not abstract. Handles arrays, refs, unions,
pointers, and functions. If the non-abstract type has abstract base
classes, point out that their pure virtual methods must have been
overridden.
Adds onto #141911
|
|
In DEF_TRAVERSE_TMPL_SPEC_DECL, we attempt to skip implicit
instantiations by detecting that D->getTemplateArgsAsWritten() returns
nullptr. Previously, this was not reliable. To ensure we do not regress,
add an assertion and a test.
|
|
pack intrinsics to be used in constexpr (#156003)
Fixes #154283
|
|
complex types are passed to __builtin_os_log_format (#158744)
This change fixes the crash in clang's CodeGen by erroring out in Sema
if those arguments are passed.
rdar://139824423
|
|
Fixes #109839
This change is really simple. It creates a matrix alias that will let
HLSL use the existing clang `matrix_type` infra.
The only additional change was to add explicit aliases for the typed
dimensions of the 1-4 inclusive matrices available in HLSL.
Testing therefore is limited to exercising the alias, sema errors, and
basic codegen.
Future work will add things like constructors and accessors.
The main difference in this attempt is the type printer and less of an
emphasis on tests where things overlap with existing `matrix_type`
testing like cast behavior.
|