path: root/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
author Krzysztof Drewniak <Krzysztof.Drewniak@amd.com> 2024-11-18 13:41:54 -0800
committer GitHub <noreply@github.com> 2024-11-18 15:41:54 -0600
commit 31aa7f34e07c901773993dac0f33568307f96da6
tree febd0bd4bb5b802a22b00beabf041933b96e929b /llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
parent f8d1905a24c16bf6db42d428672401156ef6a473
[mlir][Affine] Let affine.[de]linearize_index omit outer bounds (#116103)
The affine.delinearize_index and affine.linearize_index operations, as currently defined, require providing a length-N basis to [de]linearize N values. The first value in this basis is never used during lowering. (Note that even though it isn't used during lowering, it can still be used to, for example, remove length-1 outputs from a delinearize.)

This dead value makes sense in the original context of these operations, which is linearizing or delinearizing indexes to memref<>s, vector<>s, and other shaped types, where that outer bound is available and may be useful for analysis.

However, other use cases exist where the outer bound is not known. For example:

    %thread_id_x = gpu.thread_id x : index
    %0:3 = affine.delinearize_index %thread_id_x into (4, 16) : index, index, index

In this code, we don't know the upper bound of the thread ID, but we do want to construct the ?x4x16 grid of delinearized values in order to further partition the GPU threads.

In order to support such use cases, we broaden the definition of affine.delinearize_index and affine.linearize_index to make the outer bound optional.

In the case of affine.delinearize_index, where the number of results is a function of the size of the passed-in basis, we augment all existing builders with a `hasOuterBound` argument, which, for backwards compatibility and to preserve the natural usage of the op, defaults to `true`. If this flag is true, the op returns one result per basis element; if it is false, it returns one extra result in position 0.

We also update existing canonicalization patterns (and move one of them into the folder) to handle these cases. Note that disagreements about the outer bound no longer prevent delinearize/linearize cancellations.
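The arithmetic described above can be sketched in plain Python. This is an illustrative model only, not the MLIR lowering or its C++ builders: the function names and the `has_outer_bound` parameter are hypothetical stand-ins that mirror the `hasOuterBound` flag from the message, and it assumes non-negative indices.

```python
def delinearize_index(linear, basis, has_outer_bound=True):
    """Split `linear` into per-dimension indices, outermost first.

    Hypothetical model: when `has_outer_bound` is True, basis[0] is the
    (unused) outer bound and the result has len(basis) entries; when it is
    False, the result has len(basis) + 1 entries, with the extra,
    unbounded result in position 0.
    """
    # The outer bound, if present, never takes part in the arithmetic.
    divisors = basis[1:] if has_outer_bound else basis
    results = []
    for extent in reversed(divisors):
        results.append(linear % extent)  # index within this dimension
        linear //= extent                # carry the rest outward
    results.append(linear)               # outermost result: whatever remains
    results.reverse()
    return results


def linearize_index(indices, basis):
    """Inverse of delinearize_index; accepts a basis with or without the outer bound."""
    if len(basis) == len(indices):
        basis = basis[1:]                # drop the unused outer bound
    linear = indices[0]
    for index, extent in zip(indices[1:], basis):
        linear = linear * extent + index
    return linear


# Mirrors the thread-ID example: a ?x4x16 grid with no known outer bound.
print(delinearize_index(100, [4, 16], has_outer_bound=False))
```

In this model, passing the basis `[2, 4, 16]` with the default `has_outer_bound=True` gives the same three results as passing `[4, 16]` with `has_outer_bound=False`, which is why a disagreement about the outer bound need not block a delinearize/linearize round trip.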
Diffstat (limited to 'llvm/lib/Bitcode/Writer/BitcodeWriter.cpp')
0 files changed, 0 insertions, 0 deletions