path: root/llvm/lib/CodeGen/MachineBasicBlock.cpp
author: Kajetan Puchalski <kajetan.puchalski@arm.com> 2025-05-22 15:11:46 +0100
committer: GitHub <noreply@github.com> 2025-05-22 15:11:46 +0100
commit: c2892b0bdfb34bd4a79f357ee2f234a29f9e49f4
tree: d04642f98974870184d046d639433f544b3798a8 /llvm/lib/CodeGen/MachineBasicBlock.cpp
parent: 6375a8508e836a49ffcee306b57166a39a950afe
[flang-rt] Optimise ShallowCopy and use it in CopyInAssign (#140569)
Using Descriptor.Element<>() when iterating through a rank-1 array is currently inefficient, because the generic implementation suitable for arrays of any rank makes the compiler unable to perform optimisations that would make the rank-1 case considerably faster.

This is currently done inside ShallowCopy, as well as by CopyInAssign, where the implementation of elemental copies (inside Assign) is equivalent to ShallowCopyDiscontiguousToDiscontiguous.

To address that, add a DescriptorIterator abstraction specialised for arrays of various ranks, and use that throughout ShallowCopy to iterate over the arrays.

Furthermore, depending on the pointer type passed to memcpy, the optimiser can remove the memcpy calls from ShallowCopy altogether which can result in substantial performance improvements on its own. Specialise ShallowCopy for various element pointer types to make these optimisations possible.

Finally, replace the call to Assign inside CopyInAssign with a call to newly optimised ShallowCopy.

For the thornado-mini application, this reduces the runtime by 27.7%.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
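
The rank-specialised iteration described in the commit message might look roughly like the sketch below. It is not the flang-rt code: MiniDescriptor, MiniIterator and ShallowCopySketch are hypothetical names invented for illustration, and the real Descriptor and DescriptorIterator types may differ substantially. The sketch only shows the general idea: a generic iterator has to carry subscripts for every dimension, a rank-1 specialisation reduces to a pointer bump, and a memcpy with a compile-time element size gives the optimiser a chance to turn it into a plain load and store.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr int kMaxRank = 3; // assumption: small fixed bound, for the sketch only

// Hypothetical, heavily simplified stand-in for a runtime array descriptor.
struct MiniDescriptor {
  char *base;                                      // address of the first element
  int rank;                                        // number of dimensions in use
  std::array<std::ptrdiff_t, kMaxRank> extent;     // elements per dimension
  std::array<std::ptrdiff_t, kMaxRank> byteStride; // bytes between neighbours

  std::size_t Elements() const {
    std::size_t n = 1;
    for (int j = 0; j < rank; ++j) n *= static_cast<std::size_t>(extent[j]);
    return n;
  }
};

// Generic iterator: tracks one subscript per dimension and carries into the
// next dimension on wrap-around (first dimension varies fastest).
template <int RANK> class MiniIterator {
public:
  explicit MiniIterator(const MiniDescriptor &d) : d_{d} {}
  char *Get() const { return d_.base + offset_; }
  void Advance() {
    for (int j = 0; j < RANK; ++j) {
      if (++subscript_[j] < d_.extent[j]) {
        offset_ += d_.byteStride[j];
        return;
      }
      // Dimension j is exhausted: rewind it and carry into dimension j + 1.
      offset_ -= (d_.extent[j] - 1) * d_.byteStride[j];
      subscript_[j] = 0;
    }
  }

private:
  const MiniDescriptor &d_;
  std::array<std::ptrdiff_t, RANK> subscript_{};
  std::ptrdiff_t offset_{0};
};

// Rank-1 specialisation: no subscripts, just add the stride each step.
template <> class MiniIterator<1> {
public:
  explicit MiniIterator(const MiniDescriptor &d)
      : at_{d.base}, stride_{d.byteStride[0]} {}
  char *Get() const { return at_; }
  void Advance() { at_ += stride_; }

private:
  char *at_;
  std::ptrdiff_t stride_;
};

// Element-by-element copy between two possibly discontiguous arrays.
// T fixes the element size (and implied alignment) at compile time.
template <typename T, int RANK>
void ShallowCopySketch(const MiniDescriptor &to, const MiniDescriptor &from) {
  assert(to.rank == RANK && from.rank == RANK);
  assert(to.Elements() == from.Elements());
  MiniIterator<RANK> toIt{to}, fromIt{from};
  for (std::size_t n = from.Elements(); n-- > 0;
       toIt.Advance(), fromIt.Advance()) {
    std::memcpy(reinterpret_cast<T *>(toIt.Get()),
                reinterpret_cast<const T *>(fromIt.Get()), sizeof(T));
  }
}
```

A caller would pick the T and RANK instantiation from the descriptor's element size and rank at run time; that dispatch is omitted here, and how the actual patch performs it is not shown in this commit page.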
Diffstat (limited to 'llvm/lib/CodeGen/MachineBasicBlock.cpp')
0 files changed, 0 insertions, 0 deletions