aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
diff options
context:
space:
mode:
authorFraser Cormack <fraser@codeplay.com>2022-05-16 10:57:32 +0100
committerFraser Cormack <fraser@codeplay.com>2022-05-24 06:53:51 +0100
commitcb8681a2b3ad406f1215a17fd32989211c561e3e (patch)
tree8e09beb8723933f65e882f0acdcdf881b12d0f8c /llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
parent1868f3c17db971bde79f54530aa8686f04bff287 (diff)
downloadllvm-cb8681a2b3ad406f1215a17fd32989211c561e3e.zip
llvm-cb8681a2b3ad406f1215a17fd32989211c561e3e.tar.gz
llvm-cb8681a2b3ad406f1215a17fd32989211c561e3e.tar.bz2
[RISCV] Fix RVV stack frame alignment bugs
This patch addresses several alignment issues in the stack frame when RVV objects are taken into account. One bug is that the RVV stack was never guaranteed to keep the alignment of the stack *as a whole*. We must maintain a 16-byte aligned stack at all times, especially when calling other functions. With the standard V extension, this is conveniently happening since VLEN is at least 128 and always 16-byte aligned. However, we support Zvl64b which does not guarantee this. To fix this, the RVV stack size is rounded up to be aligned to 16 bytes. This in practice generally makes us allocate a stack sized at least 2*VLEN in size, and a multiple of 2. |------------------------------| -- <-- FP | 8-byte callee-save | | | |------------------------------| | | | one VLENB-sized RVV object | | | |------------------------------| | | | 8-byte local variable | | | |------------------------------| -- <-- SP (must be aligned to 16) In the example above, with Zvl64b we are decrementing SP by 12 bytes which does not leave SP correctly aligned. We therefore introduce an extra VLENB-sized amount used for alignment. This would therefore ensure the total stack size was 16 bytes (48 for Zvl128b, 80 for Zvl256b, etc): |------------------------------| -- <-- FP | 8-byte callee-save | | | |------------------------------| | | | one VLENB-sized padding obj | | | | one VLENB-sized RVV object | | | |------------------------------| | | | 8-byte local variable | | | |------------------------------| -- <-- SP A new RVV invariant has been introduced in this patch, which is that the base of the RVV stack itself is now always aligned to 16 bytes, not 8 as before. This keeps us more in line with the scalar stack and should be easier to reason about. The calculation of the RVV padding has thus changed to be the amount required to align the scalar local variable section to the RVV section's alignment. This amount is further rounded up when setting up the initial stack to keep everything aligned: |------------------------------| -- <-- FP | 8-byte callee-save | |------------------------------| | | | RVV objects | | (aligned to at least 16) | | | |------------------------------| | RVV padding of 8 bytes | |------------------------------| | 8-byte local variable | |------------------------------| -- <-- SP In the example above, it's clear that we need 8 bytes of padding to keep the RVV section aligned to 16 when using SP. But to keep SP *itself* aligned to 16 we can't decrement the initial stack pointer by 24 - we have to round up to 32. With the RVV section correctly aligned, the second bug fixed by this patch is that RVV objects themselves are now correctly aligned. We were previously only guaranteeing an alignment of 8 bytes, even if they required a higher alignment. This is relatively simple and in practice we see more rounding up of VLEN amounts to account for alignment in between objects: |------------------------------| | RVV object (aligned to 16) | |------------------------------| | no padding necessary | |------------------------------| | 2*VLENB RVV object (align 16)| |------------------------------| | VLENB alignment padding | |------------------------------| | RVV object (align 32) | |------------------------------| | 3*VLENB alignment padding | |------------------------------| | VLENB RVV object (align 32) | |------------------------------| -- <-- base of RVV section Note that a lot of the regressions in codegen owing to the new alignment rules are correct but actually only strictly necessary for Zvl64b (and Zvl32b but that's not really supported). I plan a follow-up patch to take the known VLEN into account when padding for alignment. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D125787
Diffstat (limited to 'llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp')
0 files changed, 0 insertions, 0 deletions