|
GPU Dialect lowering to the SYCL runtime is driven by the spirv.target_env
attribute attached to gpu.module. As a result, spirv.target_env remains an
input to LLVMIR Translation.
A SPIRVToLLVMIRTranslation without any actual translation is added to
avoid an unregistered error in mlir-cpu-runner.
SelectObjectAttr.cpp is updated to
1) Pass binary size argument to getModuleLoadFn
2) Pass parameter count to getKernelLaunchFn
This change does not impact CUDA and ROCM usage since both
mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept
and ignore the extra arguments.
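For illustration, the shape of the signature change might look as follows. This is only a sketch, not copied from the patch: the exact names and argument lists are those of the mgpu* runtime wrappers in mlir_cuda_runtime / mlir_rocm_runtime.

```cpp
#include <cstddef>
#include <cstdint>

// Before (sketch): the module loader received only the serialized blob.
// extern "C" void *mgpuModuleLoad(void *blob);

// After (sketch): the binary size is threaded through as well, which a
// SYCL-based runtime needs in order to load the module.
extern "C" void *mgpuModuleLoad(void *blob, size_t blobSize);

// After (sketch): the kernel-launch entry point additionally receives the
// number of kernel parameters, so runtimes that cannot introspect the
// kernel can still marshal the argument array.
extern "C" void mgpuLaunchKernel(void *kernel, intptr_t gridX, intptr_t gridY,
                                 intptr_t gridZ, intptr_t blockX,
                                 intptr_t blockY, intptr_t blockZ,
                                 int32_t sharedMemBytes, void *stream,
                                 void **params, void **extra,
                                 size_t paramCount);
```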
|
|
storage (#71044)
Previously, we were inserting za.enable/disable intrinsics for functions
with the "arm_za" attribute (at the MLIR level), rather than using the
backend attributes. This was done to avoid a dependency on the SME ABI
functions from compiler-rt (which have only recently been implemented).
Doing things this way had correctness issues. For example, calling
a streaming-mode function from another streaming-mode function (both
with ZA enabled) would lead to ZA being disabled after returning to the
caller (where it should still be enabled). Fixing issues like this would
require re-doing, within MLIR, the ABI work already done in the backend.
Instead, this patch switches to use the "arm_new_za" (backend) attribute
for enabling ZA for an MLIR function. For the integration tests, this
requires some way of linking the SME ABI functions. This is done via the
`%arm_sme_abi_shlib` lit substitution. By default, this expands to a
stub implementation of the SME ABI functions, but this can be overridden
by providing the `ARM_SME_ABI_ROUTINES_SHLIB` CMake cache variable
(pointing it at an alternative implementation). For now, the ArmSME
integration tests pass with just stubs, as we don't make use of nested
ZA-enabled calls.
A future patch may add an option to compiler-rt to build the SME
builtins into a standalone shared library to allow easily
building/testing with the actual implementation.
|
|
This patch adds an NVPTX compilation path that enables JIT compilation
on NVIDIA targets. The following modifications were performed:
1. Adding a format field to the GPU object attribute, allowing the
translation attribute to use the correct runtime function to load the
module. Likewise, a dictionary attribute was added to hold any
extra options.
2. Adding the `createObject` method to `GPUTargetAttrInterface`; this
method returns a GPU object from a binary string.
3. Adding the function `mgpuModuleLoadJIT`, which is only available for
NVIDIA GPUs, as there is no equivalent for AMD.
4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify
the format to use during testing.
|
|
This work introduces sm90 integration testing and adds a single test.
Depends on : D155825 D155680 D155563 D155453
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155838
|
|
With the recent addition of "-mattr" and "-march" to the list of options
supported by mlir-cpu-runner [1], the SVE integration
tests can be updated to use mlir-cpu-runner instead of lli. This will
allow better code re-use and more consistency.
This patch updates 2 tests to demonstrate the new logic. The remaining
tests will be updated in the follow-up patches.
[1] https://reviews.llvm.org/D146917
Depends on D155403
Differential Revision: https://reviews.llvm.org/D155405
|
|
This patch updates one SparseTensor integration test so that the VLA
vectorisation is run conditionally based on the value of the
MLIR_RUN_ARM_SME_TESTS CMake variable.
This change opens the path to reduce the duplication of RUN lines in
"mlir/test/Integration/Dialect/SparseTensor/CPU/". ATM, there are
usually 2 RUN lines to test vectorization in SparseTensor integration
tests:
* one for VLS vectorisation,
* one for VLA vectorisation whenever that's available and which
reduces to VLS vectorisation when VLA is not supported.
When VLA is not available, VLS vectorisation is verified twice. This
duplication should be avoided - integration tests are relatively
expensive to run.
This patch makes sure that the 2nd vectorisation RUN line becomes:
```
if (SVE integration tests are enabled)
run VLA vectorisation
else
return
```
This logic is implemented using LIT's (relatively new) conditional
substitution [1]. It enables us to guarantee that all RUN lines are
unique and that the VLA vectorisation is only enabled when supported.
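As a sketch of the mechanism, a conditionally-expanded RUN line looks roughly like this. The feature name `arm-sve` and the pipeline are illustrative assumptions, not taken from the actual test:

```
// In lit.cfg.py, a feature is registered only when the CMake variable is on:
//   if config.mlir_run_arm_sme_tests:
//       config.available_features.add("arm-sve")
//
// A RUN line then expands to the VLA invocation only when the feature is
// present, and to nothing otherwise:
// RUN: %if arm-sve %{ mlir-opt %s -some-vla-pipeline | FileCheck %s %}
```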
This patch updates only 1 test to set up and demonstrate the logic.
Subsequent patches will update the remaining tests.
[1] https://www.llvm.org/docs/TestingGuide.html
Differential Revision: https://reviews.llvm.org/D155403
|
|
explicitly set
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D152966
|
|
This patch adds a couple of tests for targeting Arm Streaming SVE (SSVE)
mode, part of the Arm Scalable Matrix Extension (SME).
SSVE is enabled in the backend at the function boundary by specifying
the `aarch64_pstate_sm_enabled` attribute, as documented here [1]. SSVE
can be targeted from MLIR by specifying this in the passthrough
attributes [2] and compiling with
-mattr=+sme,+sve -force-streaming-compatible-sve
The passthrough will propagate to the backend where `smstart/smstop`
will be emitted around the call to the SSVE function.
The set of legal instructions changes in SSVE,
`-force-streaming-compatible-sve` avoids the use of NEON entirely and
instead lowers to (streaming-compatible) SVE. The behaviour this flag
controls will be hooked up to the function attribute in the future, such
that specifying the attribute alone should lead to correct code
generation.
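A minimal sketch of the passthrough mechanism, following the attribute-pass-through docs in [2]; the function name and body are illustrative:

```mlir
// The "passthrough" list is forwarded verbatim as LLVM IR function
// attributes, so the backend sees aarch64_pstate_sm_enabled and emits
// smstart/smstop around calls to this function.
func.func @ssve_add(%a: vector<[4]xf32>, %b: vector<[4]xf32>) -> vector<[4]xf32>
    attributes { passthrough = ["aarch64_pstate_sm_enabled"] } {
  %0 = arith.addf %a, %b : vector<[4]xf32>
  return %0 : vector<[4]xf32>
}
```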
Two tests are added:
* A basic LLVMIR test verifying the attribute is passed through.
* An integration test calling an SSVE function.
The integration test can be run with QEMU.
[1] https://llvm.org/docs/AArch64SME.html
[2] https://mlir.llvm.org/docs/Dialects/LLVM/#attribute-pass-through
Reviewed By: awarzynski, aartbik
Differential Revision: https://reviews.llvm.org/D148111
|
|
The SM80 flag guards the test on targets that do not support A100 GPUs
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D147863
|
|
Windows"
This reverts commit 5561e174117ff395d65b6978d04b62c1a1275138
The logic was moved from CMake into lit, fixing the issue that led to the revert and potentially other issues with multi-config CMake generators
Differential Revision: https://reviews.llvm.org/D143925
|
|
on Windows"
This reverts commit 161b9d741a3c25f7bd79620598c5a2acf3f0f377.
REASON:
cmake --build . --target check-mlir-integration
Failed Tests (186):
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-addi-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-cmpi-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-compare-results-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-constants-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-max-min-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-muli-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shli-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shrsi-i16.mlir
MLIR :: Integration/Dialect/Arith/CPU/test-wide-int-emulation-shrui-i16.mlir
MLIR :: Integration/Dialect/Async/CPU/microbench-linalg-async-parallel-for.mlir
MLIR :: Integration/Dialect/Async/CPU/microbench-scf-async-parallel-for.mlir
MLIR :: Integration/Dialect/Async/CPU/test-async-parallel-for-1d.mlir
MLIR :: Integration/Dialect/Async/CPU/test-async-parallel-for-2d.mlir
MLIR :: Integration/Dialect/Complex/CPU/correctness.mlir
MLIR :: Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm-vector.mlir
MLIR :: Integration/Dialect/LLVMIR/CPU/X86/test-inline-asm.mlir
MLIR :: Integration/Dialect/LLVMIR/CPU/test-vector-reductions-fp.mlir
MLIR :: Integration/Dialect/LLVMIR/CPU/test-vector-reductions-int.mlir
MLIR :: Integration/Dialect/Linalg/CPU/matmul-vs-matvec.mlir
MLIR :: Integration/Dialect/Linalg/CPU/rank-reducing-subview.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-collapse-tensor.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-elementwise.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-expand-tensor.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-one-shot-bufferize.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-padtensor.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-subtensor-insert-multiple-uses.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-subtensor-insert.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-tensor-e2e.mlir
MLIR :: Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir
MLIR :: Integration/Dialect/Memref/cast-runtime-verification.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/concatenate.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output_bf16.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/dense_output_f16.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_abs.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_binary.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_codegen_dim.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_codegen_foreach.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex32.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex64.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_complex_ops.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_constant_to_sparse_tensor.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_dyn.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_ptr.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_dot.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_expand.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_file_io.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_flatten.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_foreach_slices.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_index.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_1d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_2d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_insert_3d.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matmul.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matrix_ops.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_mttkrp.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_mult_elt.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_reduction.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_out_simple.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_re_im.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reduce_custom.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reduce_custom_prod.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reductions.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reductions_prod.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_reshape.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_scale.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_scf_nested.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_select.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sign.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sorted_coo.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_storage.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_bf16.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_c32.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_sum_f16.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tanh.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tensor_mul.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_tensor_ops.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_transpose.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir
MLIR :: Integration/Dialect/SparseTensor/CPU/sparse_vector_ops.mlir
MLIR :: Integration/Dialect/SparseTensor/python/test_SDDMM.py
MLIR :: Integration/Dialect/SparseTensor/python/test_SpMM.py
MLIR :: Integration/Dialect/SparseTensor/python/test_elementwise_add_sparse_output.py
MLIR :: Integration/Dialect/SparseTensor/python/test_output.py
MLIR :: Integration/Dialect/SparseTensor/python/test_stress.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_MTTKRP.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_SDDMM.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_SpMM.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_SpMV.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_Tensor.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_scalar_tensor_algebra.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_simple_tensor_algebra.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_complex.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_types.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_tensor_unary_ops.py
MLIR :: Integration/Dialect/SparseTensor/taco/test_true_dense_tensor_algebra.py
MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_core.py
MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_io.py
MLIR :: Integration/Dialect/SparseTensor/taco/unit_test_tensor_utils.py
MLIR :: Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir
MLIR :: Integration/Dialect/Standard/CPU/test_subview.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-mulf-full.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-mulf.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli-ext.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli-full.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-muli.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-tilezero-block.mlir
MLIR :: Integration/Dialect/Vector/CPU/AMX/test-tilezero.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-dot.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-inline-asm-vector-avx512.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-mask-compress.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-rsqrt.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-sparse-dot-product.mlir
MLIR :: Integration/Dialect/Vector/CPU/X86Vector/test-vp2intersect-i32.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-0-d-vectors.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-broadcast.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-compress.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-constant-mask.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-contraction.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-create-mask-v4i1.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-create-mask.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-expand.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-extract-strided-slice.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-flat-transpose-col.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-flat-transpose-row.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-fma.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-gather.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-index-vectors.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-insert-strided-slice.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-maskedload.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-maskedstore.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-matrix-multiply-col.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-matrix-multiply-row.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-outerproduct-f32.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-outerproduct-i64.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-print-int.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-realloc.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f32-reassoc.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f32.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f64-reassoc.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-f64.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i32.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i4.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-i64.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-si4.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-reductions-ui4.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-scan.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-scatter.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-shape-cast.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-shuffle.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-sparse-dot-matvec.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-sparse-saxpy-jagged-matvec.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-1d.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-2d.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read-3d.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-read.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-to-loops.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transfer-write.mlir
MLIR :: Integration/Dialect/Vector/CPU/test-transpose.mlir
Testing Time: 0.29s
Unsupported: 31
Passed : 5
Failed : 186
Differential Revision: https://reviews.llvm.org/D143970
|
|
This patch contains the changes required to make the vast majority of integration and runner tests run on Windows.
Historically speaking, JIT support for Windows has been lagging behind, but recent versions of ORC JIT have now caught up and work for basically all examples in the repo.
Sadly, because these tests previously did not work on Windows, basically all of them make Unix-like assumptions about things like filenames, paths, shell syntax, etc.
This patch fixes all these issues in one big swoop and enables Windows support for the vast majority of integration tests.
More specifically, the following changes had to be made:
* The various JIT runners used paths to the runtime libraries that assumed a Unix toolchain layout and filenames. I abstracted the specific path and filename of these runtime libraries away by making the paths to the runtime libraries be passed from cmake into lit. This now also allows a much more convenient syntax: `--shared-libs=%mlir_c_runner_utils` instead of `--shared-libs=%mlir_lib_dir/lib/libmlir_c_runner_utils%shlibext`
* Some tests using Python set environment variables using the `ENV=VALUE cmd` format. This works on Unix, but on Windows it has to be prefixed with `env`: `env ENV=VALUE cmd`
* Some tests used C functions that are simply not available or exported on Windows (`fabsf`, `aligned_alloc`). These tests have either been adjusted or explicitly marked as `UNSUPPORTED`
Some tests remain disabled on Windows as before:
* In SparseTensor, some tests have non-trivial logic for finding the runtime libraries, which seems to be required for the use of emulators. I do not have the time to port these, so I simply kept them disabled
* Some tests require special hardware that I simply cannot test, and they remain disabled on Windows. These include usage of AVX512 or AMX
The tests for `mlir-vulkan-runner` and `mlir-spirv-runner` all work now as well, and so do the vast majority of the `mlir-cpu-runner` tests.
Differential Revision: https://reviews.llvm.org/D143925
|
|
This patch adds the initial VP intrinsic integration test on the host backend and an RVV emulator. Please see the more detailed [discussion on the discourse](https://discourse.llvm.org/t/mlir-vp-ops-on-rvv-backend-integration-test-and-issues-report/66343).
- Run the test cases on the host by configuring the CMake option: `-DMLIR_INCLUDE_INTEGRATION_TESTS=ON`
- Build the RVV environment and run the test cases on RVV QEMU by [this doc](https://gist.github.com/zhanghb97/ad44407e169de298911b8a4235e68497).
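For reference, a typical host-side configure-and-run sequence might look like this; the project list is an illustrative assumption, not mandated by the patch:

```
# Configure with the integration tests enabled, then run the suite.
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DMLIR_INCLUDE_INTEGRATION_TESTS=ON
ninja check-mlir-integration
```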
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137816
|
|
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D136997
|
|
|
|
Replace the following config attributes with `mlir_lib_dir`:
- `mlir_runner_utils_dir`
- `linalg_test_lib_dir`
- `spirv_wrapper_library_dir`
- `vulkan_wrapper_library_dir`
- `mlir_integration_test_dir`
I'm going to clean up substitutions in separate changes.
Reviewed By: aartbik, mehdi_amini
Differential Revision: https://reviews.llvm.org/D133217
|
|
This resubmits commit 0816b62, reverted in commit 328bbab, but without removing the config.target_triple.
Lit checks UNSUPPORTED tags in the input against the config.target_triple (https://llvm.org/docs/TestingGuide.html#constraining-test-execution).
The original commit made the following bots start failing, because unsupported tests were no longer skipped:
- s390x: https://lab.llvm.org/buildbot/#/builders/199/builds/9247
- Windows: https://lab.llvm.org/buildbot/#/builders/13/builds/25321
- Sanitizer: https://lab.llvm.org/buildbot/#/builders/5/builds/27187
|
|
This reverts commit 0816b629c9da5aa8885c4cb3fbbf5c905d37f0ee.
Reason: Broke the sanitizer buildbots. More information available in the
original phabricator review: https://reviews.llvm.org/D132726
|
|
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D132726
|
|
When building in debug mode, the link time of the standalone sample is excessive, taking upwards of a minute if using BFD. This at least allows lld to be used if the main invocation was configured that way. On my machine, this gets a standalone test that requires a relink to run in ~13s for Debug mode. This is still a lot, but better than it was. I think we may want to do something about this test: it adds a lot of latency to a normal compile/test cycle and requires a bunch of arg fiddling to exclude.
I think we may end up wanting a `check-mlir-heavy` target that can be used just prior to submit, and then make `check-mlir` just run unit/lite tests. More just thoughts for the future (none of that is done here).
Reviewed By: bondhugula, mehdi_amini
Differential Revision: https://reviews.llvm.org/D126585
|
|
Running these integration tests requires access to an SVE-enabled CPU
or an emulator with SVE support. When using an emulator, aarch64
versions of lli and the MLIR C Runner Utils library are also required.
Differential Revision: https://reviews.llvm.org/D104517
|
|
This mechanically applies the same changes from D121427 everywhere.
Differential Revision: https://reviews.llvm.org/D121746
|
|
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.
Differential Revision: https://reviews.llvm.org/D119918
|
|
Compiling code for AMD GPUs requires knowledge of which chipset is
being targeted, especially if the code uses chipset-specific
intrinsics (which is the case in a downstream convolution generator).
This commit adds `target`, `chipset` and `features` arguments to the
SerializeToHsaco constructor to enable passing in this required
information.
It also amends the ROCm integration tests to pass in the target
chipset, which is set to the chipset of the first GPU on the system
executing the tests.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114107
|
|
Re-applies D111513:
* Adds a full-fledged Python example dialect and tests to the Standalone example (needed a bit of tweaking in the top-level CMake and lit tests to adapt better when not building with Python enabled).
* Rips out remnants of custom extension building in favor of pybind11_add_module which does the right thing.
* Makes python and extension sources installable (outputs to src/python/${name} in the install tree): Both Python and C++ extension sources get installed as downstreams need all of this in order to build a derived version of the API.
* Exports sources targets (with our properties that make everything work) by converting them to INTERFACE libraries (which have export support), as recommended for the foreseeable future by CMake devs. Renames custom properties to start with a lower-case letter, as also recommended/required (groan).
* Adds a ROOT_DIR argument to declare_mlir_python_extension since now all C++ sources for an extension must be under the same directory (to line up at install time).
* Downstreams will need to adapt by:
* Remove absolute paths from any SOURCES for declare_mlir_python_extension (I believe all downstreams are just using ${CMAKE_CURRENT_SOURCE_DIR} here, which can just be omitted). May need to set ROOT_DIR if not relative to the current source directory.
* To allow further downstreams to install/build, will need to make sure that all C++ extension headers are also listed under SOURCES for declare_mlir_python_extension.
This reverts commit 1a6c26d1f52999edbfbf6a978ae3f0e6759ea755.
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D113732
|
|
* Call `llvm_canonicalize_cmake_booleans` for all CMake options,
which are propagated to `lit.local.cfg` files.
* Use Python native boolean values instead of strings for such options.
This fixes cases when CMake variables have values other than `ON` (like `TRUE`).
This might happen due to IDE integration or due to CMake preset usage.
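A sketch of the pattern; the variable names are examples:

```cmake
# CMakeLists.txt for the test suite: canonicalize every truthy spelling
# (ON, TRUE, YES, 1, ...) into Python's True/False before the values are
# substituted into lit.site.cfg.py.in.
llvm_canonicalize_cmake_booleans(
  MLIR_INCLUDE_INTEGRATION_TESTS
  MLIR_ENABLE_CUDA_RUNNER
)
# lit.site.cfg.py.in then reads, e.g.:
#   config.enable_cuda_runner = @MLIR_ENABLE_CUDA_RUNNER@
# which expands to a native Python boolean rather than a string.
```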
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D110073
|
|
When LLVM and MLIR are built as subprojects (via add_subdirectory),
the CMake configuration that indicates where the MLIR libraries are is
not necessarily in the same cmake/ directory as LLVM's configuration.
This patch removes that assumption about where MLIRConfig.cmake is
located.
(As an additional note, the %llvm_lib_dir substitution was never
defined, and so find_package(MLIR) in the build was succeeding for
other reasons.)
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D103276
|
|
Fix inconsistent MLIR CMake variable names. Consistently name them as
MLIR_ENABLE_<feature>.
Eg: MLIR_CUDA_RUNNER_ENABLED -> MLIR_ENABLE_CUDA_RUNNER
MLIR follows (or has mostly followed) the convention of naming
CMake enabling variables in the form MLIR_ENABLE_<feature>. Using a
convention here is easy and also important for consistency. A counter
pattern was started with variables named MLIR_..._ENABLED. This led to a
sequence of related counter patterns: MLIR_CUDA_RUNNER_ENABLED,
MLIR_ROCM_RUNNER_ENABLED, etc.. From a naming standpoint, the imperative
form is more meaningful. Additional discussion at:
https://llvm.discourse.group/t/mlir-cmake-enable-variable-naming-convention/3520
Switch all inconsistent ones to the ENABLE form. Keep the couple of old
mappings needed until buildbot config is migrated.
Differential Revision: https://reviews.llvm.org/D102976
|
|
Add a test case to test the complete execution of WMMA ops on a Nvidia
GPU with tensor cores. These tests are enabled under
MLIR_RUN_CUDA_TENSOR_CORE_TESTS.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D95334
|
|
We will soon be adding non-AVX512 operations to MLIR, such as AVX's rsqrt. In https://reviews.llvm.org/D99818 several possibilities were discussed, namely to (1) add non-AVX512 ops to the AVX512 dialect, (2) add more dialects (e.g. AVX dialect for AVX rsqrt), and (3) expand the scope of the AVX512 to include these SIMD x86 ops, thereby renaming the dialect to something more accurate such as X86Vector.
Consensus was reached on option (3), which this patch implements.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100119
|
|
This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396.
I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though.
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D98447
|
|
The Intel Advanced Matrix Extensions (AMX) provides a tile matrix
multiply unit (TMUL), a tile control register (TILECFG), and eight
tile registers TMM0 through TMM7 (TILEDATA). This new MLIR dialect
provides a bridge between MLIR concepts like vectors and memrefs
and the lower level LLVM IR details of AMX.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D98470
|
|
This allows building and testing MLIR with `-DLLVM_ENABLE_LIBCXX=ON`.
|
|
Move test inputs to test/Integration directory.
Move runtime wrappers to ExecutionEngine.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D97463
|
|
This does not change the behavior directly: the tests only run when
`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However, running
`ninja check-mlir` will now run all the tests within a single
lit invocation. The previous behavior was to wait for all the integration
tests to complete before starting to run the first regular test, and the
test results were also reported separately. This change unifies all
of this and allows concurrent execution of the integration tests with
the regular non-regression tests and unit tests.
Differential Revision: https://reviews.llvm.org/D97241
|
|
Multi-configuration generators (such as Visual Studio and Xcode) allow the specification of a build flavor at build time instead of config time, so the lit configuration files need to support that - and they do for the most part. There are several places that had one of two issues (or both!):
1) Paths had %(build_mode)s set up, but then not configured, resulting in values that would not work correctly e.g. D:/llvm-build/%(build_mode)s/bin/dsymutil.exe
2) Paths did not have %(build_mode)s set up, but instead contained $(Configuration) (which is the value for Visual Studio at configuration time, for Xcode they would have had the equivalent) e.g. "D:/llvm-build/$(Configuration)/lib".
This seems to indicate that we still have a lot of fragility in the configurations, but also that a number of these paths are never used (at least on Windows) since the errors appear to have been there a while.
This patch fixes the configurations and it has been tested with Ninja and Visual Studio to generate the correct paths. We should consider removing some of these settings altogether.
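As a sketch of the fixed pattern (paths illustrative): lit interpolates `%(build_mode)s` from the `--param build_mode=<flavor>` run-time parameter, so a generated lit.site.cfg.py should route flavor-dependent paths through `lit_config.substitute` instead of baking in a config-time value:

```python
# lit.site.cfg.py (generated): keep the flavor as a run-time lit parameter.
# With a multi-config generator, CMake leaves "%(build_mode)s" in the path,
# and lit fills it in from --param build_mode=Release (or Debug, ...).
config.llvm_tools_dir = lit_config.substitute(
    "D:/llvm-build/%(build_mode)s/bin")
```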
Reviewed By: JDevlieghere, mehdi_amini
Differential Revision: https://reviews.llvm.org/D96427
|
|
07f1047f41d changed the CMake detection to use find_package(Python3 ...
but didn't update the lit configuration to use the expected Python3_EXECUTABLE
CMake variable to point to the interpreter path.
This resulted in an empty path on macOS.
|
|
This patch introduces a SPIR-V runner. The aim is to run a GPU
kernel on a CPU via GPU -> SPIR-V -> LLVM conversions. This is a first
prototype, so more features will be added in due time.
- Overview
The runner follows a similar flow to the other runners in-tree. However,
having converted the kernel to SPIR-V, we encode the bind attributes of
global variables that represent kernel arguments. The SPIR-V module is then
converted to LLVM. On the host side, we emulate passing the data to the device
by creating, in the main module, globals with the same symbolic names as in the
kernel module. These global variables are later linked with the ones from the
nested module. We copy data from the kernel arguments to the globals, call the
kernel function from the nested module, and then copy the data back.
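The copy-in/launch/copy-out emulation described above can be sketched in plain C++. All symbols below are hypothetical stand-ins for the runner's generated globals and the JIT-compiled kernel, not the runner's actual symbols:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: a host-side buffer for a kernel argument, and the
// global (matched by symbolic name at link time) that the kernel reads/writes.
int32_t host_arg0[4];
int32_t kernel_global_arg0[4];

// Stand-in for the JIT-compiled SPIR-V kernel entry point.
void kernel_entry() {
  for (auto &v : kernel_global_arg0)
    v += 1;
}

// Emulated launch: copy the argument into the linked global, run the kernel,
// then copy the result back to the host-side buffer.
void emulated_launch() {
  std::copy(std::begin(host_arg0), std::end(host_arg0),
            std::begin(kernel_global_arg0));
  kernel_entry();
  std::copy(std::begin(kernel_global_arg0), std::end(kernel_global_arg0),
            std::begin(host_arg0));
}
```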
- Current state
At the moment, the runner is capable of running two modules, one nested in
the other. The kernel module must contain exactly one kernel function. Also,
the runner only supports rank-1 integer memref types as arguments (to be
extended).
- Enhancement of JitRunner and ExecutionEngine
To translate nested modules to LLVM IR, JitRunner and ExecutionEngine were
altered to take an optional function reference (defaulting to `nullptr`) that
serves as a custom LLVM IR module builder. This allows customizing LLVM IR
module creation from MLIR modules.
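The optional-builder hook can be sketched in standalone C++; `MLIRModule` and `LLVMModule` here are hypothetical placeholders for `mlir::ModuleOp` and `llvm::Module`, not the real types or signatures:

```cpp
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-ins for mlir::ModuleOp and llvm::Module.
struct MLIRModule { std::string name; };
struct LLVMModule { std::string name; };

using ModuleBuilder =
    std::function<std::unique_ptr<LLVMModule>(const MLIRModule &)>;

// Default translation used when no custom builder is supplied.
std::unique_ptr<LLVMModule> defaultTranslate(const MLIRModule &m) {
  return std::make_unique<LLVMModule>(LLVMModule{m.name});
}

// Mirrors the JitRunner extension: an optional builder, defaulting to
// nullptr, overrides how the LLVM IR module is created.
std::unique_ptr<LLVMModule> runTranslation(const MLIRModule &m,
                                           ModuleBuilder custom = nullptr) {
  return custom ? custom(m) : defaultTranslate(m);
}
```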
Reviewed By: ftynse, mravishankar
Differential Revision: https://reviews.llvm.org/D86108
|
|
Summary:
* Native '_mlir' extension module.
* Python mlir/__init__.py trampoline module.
* Lit test that checks a message.
* Uses some CMake configurations that have worked for me in the past but likely need further elaboration.
Subscribers: mgorny, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, Kayjukh, jurahul, msifontes
Tags: #mlir
Differential Revision: https://reviews.llvm.org/D83279
|
|
Summary:
`mlir-rocm-runner` is introduced in this commit to execute GPU modules on the
ROCm platform. A small wrapper to encapsulate ROCm's HIP runtime API is also
part of the commit.
Due to the behavior of ROCm, raw pointers inside memrefs passed to `gpu.launch`
must be modified on the host side to properly capture the pointer values
addressable on the GPU.
LLVM MC is used to assemble AMD GCN ISA coming out from
`ConvertGPUKernelToBlobPass` to binary form, and LLD is used to produce a shared
ELF object which could be loaded by ROCm HIP runtime.
gfx900 is the default target used right now, although it can be altered via
an option in `mlir-rocm-runner`. Future revisions may consider using the ROCm
Agent Enumerator to detect the right target on the system.
Notice AMDGPU Code Object V2 is used in this revision. Future enhancements may
upgrade to AMDGPU Code Object V3.
Bitcode libraries in ROCm-Device-Libs, which implement math routines exposed in
the `rocdl` dialect, are not yet linked; this is left as a TODO in the logic.
Reviewers: herhut
Subscribers: mgorny, tpr, dexonsmith, mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, csigg, arpith-jacob, mgester, lucyrfox, aartbik, liufengdb, stephenneuendorffer, Joonsoo, grosul1, frgossen, Kayjukh, jurahul, llvm-commits
Tags: #mlir, #llvm
Differential Revision: https://reviews.llvm.org/D80676
|
|
Make the ConvertKernelFuncToCubin pass generic:
- Rename it to ConvertKernelFuncToBlob.
- Allow specifying the triple, target chip, and target features.
- Initialization of the LLVM backend is supplied by a callback function.
- The lowering from an MLIR module to an LLVM module goes through another callback.
- Change mlir-cuda-runner to adopt the revised pass.
- Add new tests for lowering to ROCm HSA code object (HSACO).
- Tests for CUDA and ROCm are kept in separate directories.
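The callback-driven shape described above can be sketched in standalone C++. Every type and field name below is a hypothetical stand-in, not the pass's actual API:

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: the pass stays target-agnostic and is parameterized
// with a triple/chip/features plus two hooks, one to initialize the backend
// and one to serialize the lowered ISA into a binary blob.
struct BlobGeneratorOptions {
  std::string triple;   // e.g. "nvptx64-nvidia-cuda" or "amdgcn-amd-amdhsa"
  std::string chip;     // e.g. "sm_70" or "gfx900"
  std::string features;
  std::function<void()> initBackend;
  std::function<std::vector<char>(const std::string &isa)> serializeToBlob;
};

// Run the backend-init hook (if any), then delegate serialization to the
// target-specific callback.
std::vector<char> convertKernelToBlob(const std::string &isa,
                                      const BlobGeneratorOptions &opts) {
  if (opts.initBackend)
    opts.initBackend();
  return opts.serializeToBlob(isa);
}
```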
Differential Revision: https://reviews.llvm.org/D80142
|
|
This attempts to ensure that out of tree usage remains stable.
Differential Revision: https://reviews.llvm.org/D78656
|
|
Add utilities print_flops and rtclock for timing / benchmarking. Add the
mlir_runner_utils_dir test conf variable.
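As a rough approximation of how such utilities behave (the in-tree versions live in the runner utils library and may differ in implementation and signature; this standalone sketch uses std::chrono rather than their actual clock source):

```cpp
#include <chrono>
#include <cstdio>

// Return the current time in seconds as a double, suitable for measuring
// elapsed wall time around a benchmarked region.
double rtclock() {
  auto now = std::chrono::steady_clock::now().time_since_epoch();
  return std::chrono::duration<double>(now).count();
}

// Report a floating-point operation rate in GFLOPS.
void print_flops(double flops) {
  std::printf("%lf GFLOPS\n", flops / 1.0e9);
}

// Typical benchmarking pattern: time a region, then report its throughput.
void benchmark(long numFlops, void (*work)()) {
  double t0 = rtclock();
  work();
  double t1 = rtclock();
  print_flops(numFlops / (t1 - t0));
}
```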
Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
Differential Revision: https://reviews.llvm.org/D76912
|
|
Add an initial version of mlir-vulkan-runner execution driver.
A command-line utility that executes an MLIR file on Vulkan by
translating the MLIR GPU module to SPIR-V and the host part to LLVM IR before
JIT-compiling and executing the latter.
Differential Revision: https://reviews.llvm.org/D72696
|
|
PiperOrigin-RevId: 282574110
|
|
Move cuda-runtime-wrappers.so into a subdirectory to match libmlir_runner_utils.so.
Provide the parent directory when running the test and load the .so from the subdirectory.
PiperOrigin-RevId: 282410749
|
|
ldflags can contain double-quoted paths, so must use single quotes here.
PiperOrigin-RevId: 274581983
|
|
PiperOrigin-RevId: 264277760
|
|
This tool allows executing MLIR IR snippets written in the GPU dialect
on a CUDA-capable GPU. For this to work, a working CUDA installation is required
and the build has to be configured with MLIR_CUDA_RUNNER_ENABLED set to 1.
PiperOrigin-RevId: 256551415
|
|
The actual transformation from PTX source to a CUDA binary is now factored out,
enabling compiling and testing the transformations independently of a CUDA
runtime.
MLIR still has to be built with NVPTX target support for the conversions to be
built and tested.
PiperOrigin-RevId: 255167139
|