Age | Commit message (Collapse) | Author | Files | Lines |
|
Clang and other frontends generally need the LLVM data layout string in
order to generate LLVM IR modules for LLVM. MLIR clients often need it
as well, since MLIR users often lower to LLVM IR.
Before this change, the LLVM datalayout string was computed in the
LLVM${TGT}CodeGen library in the relevant TargetMachine subclass.
However, none of the logic for computing the data layout string requires
any details of code generation. Clients who want to avoid duplicating
this information were forced to link in LLVMCodeGen and all registered
targets, leading to bloated binaries. This happened in PR #145899,
which measurably increased binary size for some of our users.
By moving this information to the TargetParser library, we
can delete the duplicate datalayout strings in Clang, and retain the
ability to generate IR for unregistered targets.
This is intended to be a very mechanical LLVM-only change, but there is
an immediately obvious follow-up to clang, which will be prepared
separately.
The vast majority of data layouts are computable with two inputs: the
triple and the "ABI name". There is only one exception, NVPTX, which has
a cl::opt to enable short device pointers. I invented a "shortptr" ABI
name to pass this option through the target independent interface.
Everything else fits. Mips is a bit awkward because it uses a special
MipsABIInfo abstraction, which includes members with codegen-like
concepts like ABI physical registers that can't live in TargetParser. I
think the string logic of looking for "n32" "n64" etc is reasonable to
duplicate. We have plenty of other minor duplication to preserve
layering.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
|
|
(#151071)
Fixes #139023.
This PR essentially removes unused global variables:
- Restores the `GlobalDCE` Legacy pass and adds it to the DirectX
backend after the finalize linkage pass
- Converts external global variables with no usage to internal linkage
in the finalize linkage pass
- (so they can be removed by `GlobalDCE`)
- Makes the `dxil-finalize-linkage` pass usable using the new pass
manager flag syntax
- Adds tests to `finalize_linkage.ll` that make sure unused global
variables are removed
- Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it
isn't removed
- Changes the `scalar-data.ll` run command to avoid removing its global
variables
---------
Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>
|
|
fixes #151764
This fix has two parts first we track all lifetime intrinsics and if
they are users of an alloca of a target extention like dx.RawBuffer then
we eliminate those memory intrinsics when we visit the alloca.
We do step one to allow us to use the Dead Store Elimination Pass. This
removes the alloca and simplifies the use of the target extention back
to using just the global. That keeps things in a form the
DXILBitcodeWriter is expecting.
Obviously to pull this off we needed to bring back the legacy pass
manager plumbing for the DSE pass and hook it up into the DirectX
backend.
The net impact of this change is that DML shader pass rate went from
89.72% (4268 successful compilations) to 90.98% (4328 successful
compilations).
|
|
Fixes #145924 and #140416
Depends on #146173 being merged first.
This PR moves the scalarizer pass to immediately before the
dxil-flatten-arrays pass to allow the dxil-flatten-arrays pass to turn
scalar GEPs (including i8 GEPs) into flattened array GEPs where
applicable.
A number of LLVM DirectX CodeGen tests have been edited to remove scalar
GEPs and also correct previously uncaught incorrectly-transformed GEPs.
No more validation errors of the form `Access to out-of-bounds memory is
disallowed` or `TGSM pointers must originate from an unambiguous TGSM
global variable` appear anymore after this PR when compiling DML
shaders.
|
|
Analysis (#140981)
Moving `DXILResourceImplicitBinding` pass and the associated `DXILResourceBindingAnalysis` lower in the llc pipeline to just before the DXIL Resource Analysis, which is where its results are first needed, and adjusting the set of analyses it preserves.
The reason for this change is that I will soon be adding `DXILResourceBindingAnalysis` dependency to `DXILPostOptimizationValidation` pass and bringing this closer to where it is needed avoid unnecessary churn to preserved analysis setting in preceding passes.
|
|
(#139562)
Move dxil resource access legacy pass before intrinsic expansion legacy
pass so TypedBuffer Loads and Stores will be created before intrinsic
expansion.
This is to facilitate #104423
|
|
The `DXILResourceImplicitBinding` pass uses the results of
`DXILResourceBindingAnalysis` to assigns register slots to resources
that do not have explicit binding. It replaces all
`llvm.dx.resource.handlefromimplicitbinding` calls with
`llvm.dx.resource.handlefrombinding` using the newly assigned binding.
If a binding cannot be found for a resource, the pass will raise a
diagnostic error. Currently this diagnostic message does not include the
resource name, which will be addressed in a separate task (#137868).
Part 2/2 of #136786
Closes #136786
|
|
Fixes #135672
Raise a diagnostic in the post optimization validation pass as defined
in
https://github.com/llvm/wg-hlsl/blob/main/proposals/0022-resource-instance-analysis.md
|
|
(#138199)
This moves the responsibility for cleaning up dead intrinsics from
DXILFinalizeLinkage to DXILOpLowering, and moves DXILFinalizeLinkage
back to it's pre-#136244 place in the pipeline. Doing this avoids issues
with DXIL passes running on obviously dead code, and makes it more clear
what DXILFinalizeLinkage is really doing.
This also helps with the story for #134260, as cleaning up dead
intrinsics doesn't make sense if this becomes a more generic pass.
Note that test/CodeGen/DirectX/remove-dead-intriniscs.ll already covers
most of the testing here. It'd be nice to have something that catches
the regression from changing the pass ordering but I couldn't come up
with anything that wouldn't be incredibly fragile.
Fixes #138180.
|
|
fixes #136243
This change converts memset into a series of geps and stores It is
intentionally limited to memsets of fixed size It also converts the byte
stores to type stores.
DXIL does not support i8 plus this reduces the total number of gep and
store instructions.
This change also moves DXILFinalizeLinkage to run after Legalization to
clean up any dead intrinsic definitions.
|
|
Replace "concept based polymorphism" with simpler PImpl idiom.
This pursues two goals:
* Enforce static type checking. Previously, target implementations hid
base class methods and type checking was impossible. Now that they
override the methods, the compiler will complain on mismatched
signatures.
* Make the code easier to navigate. Previously, if you asked your
favorite LSP server to show a method (e.g. `getInstructionCost()`), it
would show you methods from `TTI`, `TTI::Concept`, `TTI::Model`,
`TTIImplBase`, and target overrides. Now it is two less :)
There are three commits to hopefully simplify the review.
The first commit removes `TTI::Model`. This is done by deriving
`TargetTransformInfoImplBase` from `TTI::Concept`. This is possible
because they implement the same set of interfaces with identical
signatures.
The first commit makes `TargetTransformImplBase` polymorphic, which
means all derived classes should `override` its methods. This is done in
second commit to make the first one smaller. It appeared infeasible to
extract this into a separate PR because the first commit landed
separately would result in tons of `-Woverloaded-virtual` warnings (and
break `-Werror` builds).
The third commit eliminates `TTI::Concept` by merging it with the only
derived class `TargetTransformImplBase`. This commit could be extracted
into a separate PR, but it touches the same lines in
`TargetTransformInfoImpl.h` (removes `override` added by the second
commit and adds `virtual`), so I thought it may make sense to land these
two commits together.
Pull Request: https://github.com/llvm/llvm-project/pull/136674
|
|
This pass attempts to forward resource handle creation to accesses of
the handle global. This avoids dependence on optimizations like CSE and
GlobalOpt for correctness of DXIL.
Fixes #134574.
|
|
This introduces a pass that walks accesses to globals in cbuffers and
replaces them with accesses via the cbuffer handle itself. The logic to
interpret the cbuffer metadata is kept in `lib/Frontend/HLSL` so that it
can be reused by other consumers of that metadata.
Fixes #124630.
|
|
- Remove calls to pass initialization from pass constructors.
- https://github.com/llvm/llvm-project/issues/111767
|
|
- Legalize i8 truncation back to original types
- remove sext and truncs
- Legalize i64 indicies for insert\extract elements to i32 indicies
- fixes https://github.com/llvm/llvm-project/issues/126323
- fixes https://github.com/llvm/llvm-project/issues/129757
|
|
Removing `DXILResourceMDAnalysis` that gathers information about
resources for the `DXILTranslateMetadata` pass. It collects the info
based on obsolete resource metadata annotations that are going to be
removed soon.
Part 1/2 of #114126
|
|
computeRegisterProperties (#128818)
This fixes #126784 for the DirectX backend.
This bug was marked critical for DX so it is going to go in first. At
least one register class needs to be added via `addRegisterClass` for
`RegClassForVT` to be valid.
Further for costing information used by loop unroll and other
optimizations to be valid we need to call `computeRegisterProperties`.
This change does both of these.
The test cases confirm that we can fetch costing information off of
`getRegisterInfo` and that `DirectXTargetLowering` maps `i32` typed
registers to `DXILClassRegClass`.
|
|
Adding support for Root Signature Flags Element extraction and writing
to DXContainer.
- Adding an analysis to deal with RootSignature metadata definition
- Adding validation for Flag
- writing RootSignature blob into DXIL
Closes: [126632](https://github.com/llvm/llvm-project/issues/126632)
---------
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
|
|
Move the DXILOpLoweringPass after DXILTranslateMetadata, and add asserts
in DXILShaderFlags to ensure it isn't scheduled after op lowering. This
will allow us to rely on DirectX intrinsics in the shader flags analysis
rather than having to recover information from lowered operations.
Fixes #120119.
|
|
This pass transforms resource access via `llvm.dx.resource.getpointer`
into buffer loads and stores.
Fixes #114848.
|
|
This moves DXILFinalizeLinkage before the DXIL op lowering passes so
that it doesn't end up internalizing any of the `dx.op.*` functions.
This also exposed a bug when the pass is run on a module with intrinsics
in them - marking the intrinsics as internal will fail the validator.
Fixes #117761
|
|
Targets (#116290)
This PR fixes a set of build issues with experimental targets happened
in result of merging #111234 to master.
|
|
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.
cc @arsenm @aeubanks
|
|
- Relevant piece is `DXILFlattenArrays.cpp`
- Loads and Store Instruction visits are just for finding
GetElementPtrConstantExpr and splitting them.
- Allocas needed to be replaced with flattened allocas.
- Global arrays were similar to allocas. Only interesting piece here is
around initializers.
- Most of the work went into building correct GEP chains. The approach
here was a recursive strategy via `recursivelyCollectGEPs`.
- All intermediary GEPs get marked for deletion and only the leaf GEPs
get updated with the new index.
fixes [89646](https://github.com/llvm/llvm-project/issues/89646)
|
|
Preserve the existing defaults (although load-store defaulting
to false is a really bad one). Also migrate DirectX tests to new PM.
|
|
This change adds a pass to scalarize vectors in global scope into
arrays.
There are three distinct parts
1. find the globals that need to be updated and define what the new type
should be
2. initialize that new type and copy over all the right attributes over
from the old type.
3. Use the instruction visitor pattern to update the loads, stores, and
geps for the layout of the new data structure.
resolves https://github.com/llvm/llvm-project/issues/107920
|
|
backend (#107427)
As discussed in this
[proposal](https://github.com/llvm/wg-hlsl/pull/62/files?short_path=ac6e592#diff-ac6e59276afe8016e307eedc5c835f534c0cb353707760b44df0fa9d905a5cf8).
We had to bring back the legacy pass manager interface for the
scalarizer pass. Two reasons for this:
1. The DirectX backend is still using the legacy pass manager
2. The new PM isn't hooked up in clang yet via `BackendUtil.cpp`'s
`AddEmitPasses` That means even if we add a `buildCodeGenPipeline` we
won't be able to benefit from the new pass manager's scalarizer pass
interface.
The remaining changes are hooking up the scalarizer pass to the DirectX
backend, updating the DirectX test cases,
and allowing the `optdriver` to not block the legacy invocation of the
scalarizer pass.
Future work still needs to be done to allow the scalarizer pass to
handle target specific intrinsics.
closes #105178
|
|
This wires up dxil-op-lower, dxil-intrinsic-expansion, dxil-translate-metadata,
and dxil-pretty-printer to the new pass manager, both as a matter of future
proofing the backend and so that they can be used more flexibly in tests.
A few arbitrary tests are updated in order to test the new PM path, and we drop
the "print-dxil-resource-md" pass since it's redundant with the pretty printer.
Pull Request: https://github.com/llvm/llvm-project/pull/104250
|
|
An HLSL function has internal linkage by default unless it is:
1. shader entry point function
2. marked with the `export` keyword
(https://github.com/llvm/llvm-project/issues/92812)
3. patch constant function (not implemented yet)
This PR adds a link-time pass `DXILFinalizeLinkage` that updates the
linkage of functions to make sure only shader entry points and exported
functions are visible from the module (have _program linkage_). All
other functions will be updated to have internal linkage.
Related spec update: microsoft/hlsl-specs#295
Fixes #llvm/llvm-project#92071
|
|
These passes will be replaced soon as we move to the target extension based
resource handling in the DirectX backend, but removing them now before the
replacement stuff is all up and running would be very disruptive. However, we
do need to move these passes out of the way to avoid symbol conflicts with the
new DXILResourceAnalysis in the Analysis library.
Note: I tried an even simpler hack in #100698 but it doesn't really work. A
rename is the most expedient path forward here.
Pull Request: https://github.com/llvm/llvm-project/pull/101393
|
|
(#96462)
On MSVC the `this` uses inside `decltype` require a lambda capture. On
clang they result in an unused capture warning instead. Add the capture
and suppress the warning with `(void)this`.
-----
Initializing this map is somewhat expensive (especially for O0), so we
currently only do it if certain flags are used. I would like to make use
of it for crash dumps (#96078), where we don't know in advance whether
it will be needed or not.
This patch changes the initialization to a lazy approach, where a
callback is registered that does the actual initialization. The
callbacks will be run the first time the pass name is requested.
This way there is no compile-time impact if the mapping is not used.
|
|
My attempt to fix the Windows build made things worse,
revert entirely for now.
This reverts commit e7137f2fed5cfee822ae3c4c6d39188adb59a16c.
This reverts commit 6eaf204dbb0a6a81cddfd02f625c130f7bb1aae5.
This reverts commit 957dc4366dd2ce9d5d2991c3ad76bbf438e9954e.
|
|
|
|
Remove string function attribute other than
"waveops-include-helper-lanes" and "fp32-denorm-mode".
Move DXILPrepareModulePass after DXILTranslateMetadataPass since
DXILTranslateMetadataPass needs to use attribute like hlsl.numthreads.
Fixes #90773
|
|
Prepare migration for dag-isel
|
|
This change implements lowering for #70076, #70100, #70072, & #70102
`CGBuiltin.cpp` - - simplify `lerp` intrinsic
`IntrinsicsDirectX.td` - simplify `lerp` intrinsic
`SemaChecking.cpp` - remove unnecessary check
`DXILIntrinsicExpansion.*` - add intrinsic to instruction expansion
cases
`DXILOpLowering.cpp` - make sure `DXILIntrinsicExpansion` happens first
`DirectX.h` - changes to support new pass
`DirectXTargetMachine.cpp` - changes to support new pass
Why `any`, and `lerp` as instruction expansion just for DXIL?
- SPIR-V there is an
[OpAny](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpAny)
- SPIR-V has a GLSL lerp extension via
[Fmix](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#FMix)
Why `exp` instruction expansion?
- We have an `exp2` opcode and `exp` reuses that opcode. So instruction
expansion is a convenient way to do preprocessing.
- Further SPIR-V has a GLSL exp extension via
[Exp](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp)
and
[Exp2](https://registry.khronos.org/SPIR-V/specs/1.0/GLSL.std.450.html#Exp2)
Why `rcp` as instruction expansion?
This one is a bit of the odd man out and might have to move to
`cgbuiltins` when we better understand SPIRV requirements. However I
included it because it seems like [fast math mode has an AllowRecip
flag](https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_fp_fast_math_mode)
which lets you compute the reciprocal without performing the division.
We don't have that in DXIL so thought to include it.
|
|
`print-pipeline-passes` can show target pass names.
|
|
(#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
|
|
When emitting assembly we don't particularly want the binary DXIL
embedded in the output. This was mostly there for testing purposes, so
we update those tests to run the test directly using `opt` and
restrict the -dxil-embed and -dxil-globals passes to running normally
only in the case where we're trying to emit a DXContainer.
Differential Revision: https://reviews.llvm.org/D158051
|
|
DXIL doesn't need uselist strtab and symtab blocks which are not supported by llvm3.7 bitcode.
Differential Revision: https://reviews.llvm.org/D141328
|
|
|
|
|
|
Fix build error caused by createPrintModulePass moving to diffrent
header.
|
|
This diff splits out (from LLVMCore) IR printing passes into IRPrinter.
This structure is similar to what we already have for IRReader and
enables us to avoid circular dependencies between LLVMCore and Analysis
(this is a preparation for https://reviews.llvm.org/D137768).
The legacy interface is left unchanged, once the legacy pass manager
is removed (in the future) we will be able to clean it up further.
The bazel build configuration has been updated as well.
Test plan:
1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc
2/ bazel build --config=generic_clang @llvm-project//...
Differential revision: https://reviews.llvm.org/D138081
|
|
DXContainer files have a handful of sections that need to be written.
This adds a pass to write the section data into IR globals, and writes
the shader flag data into a global.
The test cases here verify that the shader flags are correctly written
from the IR into the global and emitted to the DXContainer.
This change also fixes a bug in the MCDXContainerWriter, where the size
of the dxbc::ProgramHeader was not being included in the part offset
calcuations. This is verified to be working by the new testcases where
obj2yaml can properly dump part data for parts after the DXIL part.
Resolves issue #57742 (https://github.com/llvm/llvm-project/issues/57742)
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D135793
|
|
When DXC prints IR output it adds a bunch of IR comments in a header
that describe the DXIL metadata in a more human-readable format. This
pass will serve that purpose for LLVM by printing out ahead of the IR
printer.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D135802
|
|
This adds infrastructural pieces for an analysis to compute the DXIL
shader flags. In this state the analysis can compute two fairly
straightforward feature flags for use of double-precision floating
point values and the DX 11.1 extended double support.
This patch does conflict with D135190, conflicts will be resolved prior
to merging.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D135393
# Conflicts:
# llvm/lib/Target/DirectX/CMakeLists.txt
# llvm/lib/Target/DirectX/DirectXTargetMachine.cpp
|
|
Now only DXILTranslateMetadata uses DXILResources, so DXILResourceWrapper is only used by DXILTranslateMetadata.
Once we add lower for createHandle, DXILResourceWrapper will be used in more passes.
Also we can add resource index allocation in DXILResourceWrapper.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D135190
|
|
When emit-obj from clang directly, DirectX backend will hit assert caused by not initialize passes for AsmPrinter.
The fix will initialize the passes by calling createPassConfig.
Also ignore global variable which not has section in DXILAsmPrinter::emitGlobalVariable to avoid hit llvm_unreachable in DXILTargetObjectFile::SelectSectionForGlobal.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D130856
|
|
This is the last piece to bring together writing DXContainer files
containing DXIL through the DirectX backend.
While this change only has one test, all of the tests under
llvm/test/tools/dxil-dis also exercise this code. With this change the
output object file type for the dxil target is now DXContainer. Each of
the existing tests will generate DXContainer files, and the dxil-dis
tests additionally verify that the DXContainers generated are
well-formed and can be parsed by the DirectXShaderCompiler tools.
Depends on D127153 and D127165
Differential Revision: https://reviews.llvm.org/D127166
|