author | Kajetan Puchalski <kajetan.puchalski@arm.com> | 2025-05-22 15:11:46 +0100 |
---|---|---|
committer | GitHub <noreply@github.com> | 2025-05-22 15:11:46 +0100 |
commit | c2892b0bdfb34bd4a79f357ee2f234a29f9e49f4 (patch) | |
tree | d04642f98974870184d046d639433f544b3798a8 /llvm/lib/CodeGen/MachineBasicBlock.cpp | |
parent | 6375a8508e836a49ffcee306b57166a39a950afe (diff) | |
[flang-rt] Optimise ShallowCopy and use it in CopyInAssign (#140569)

Using Descriptor.Element<>() when iterating through a rank-1 array is
currently inefficient, because the generic implementation, which has to
handle arrays of any rank, prevents the compiler from performing
optimisations that would make the rank-1 case considerably faster.

This pattern is currently used inside ShallowCopy, as well as by
CopyInAssign, where the implementation of elemental copies (inside
Assign) is equivalent to ShallowCopyDiscontiguousToDiscontiguous.

To address that, add a DescriptorIterator abstraction specialised for
arrays of various ranks, and use it throughout ShallowCopy to iterate
over the arrays.
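
The rank-specialisation idea described above can be illustrated with a small, self-contained sketch. This is not the flang-rt code: `ToyDescriptor`, `ToyIterator`, `toyShallowCopy` and `kMaxRank` are hypothetical names invented for this example, and the real Descriptor and DescriptorIterator classes may look quite different. The sketch only shows why a rank-1 specialisation can get by with a single running offset, while the generic version has to carry a subscript per dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr int kMaxRank = 3; // assumption: a small fixed limit for this toy

// Hypothetical, simplified stand-in for a runtime array descriptor.
struct ToyDescriptor {
  char *base = nullptr;      // address of the first element
  std::size_t elemBytes = 0; // size of one element in bytes
  int rank = 0;              // number of dimensions in use
  std::array<std::ptrdiff_t, kMaxRank> extent{};     // elements per dimension
  std::array<std::ptrdiff_t, kMaxRank> byteStride{}; // bytes between neighbours
};

// Generic iterator: keeps one subscript per dimension and recomputes the
// byte offset from all of them on every access.
template <int RANK> class ToyIterator {
  static_assert(RANK >= 1 && RANK <= kMaxRank, "unsupported rank");

public:
  explicit ToyIterator(const ToyDescriptor &d) : d_(d) {
    assert(d.rank == RANK);
  }
  char *get() const {
    std::ptrdiff_t offset = 0;
    for (int j = 0; j < RANK; ++j) {
      offset += subs_[j] * d_.byteStride[j];
    }
    return d_.base + offset;
  }
  // Column-major order: the first subscript varies fastest.
  void advance() {
    for (int j = 0; j < RANK; ++j) {
      if (++subs_[j] < d_.extent[j]) {
        return;
      }
      subs_[j] = 0;
    }
  }

private:
  const ToyDescriptor &d_;
  std::array<std::ptrdiff_t, RANK> subs_{};
};

// Rank-1 specialisation: a single running byte offset, no subscript array
// and no inner loop, which leaves a much simpler loop body to optimise.
template <> class ToyIterator<1> {
public:
  explicit ToyIterator(const ToyDescriptor &d) : d_(d) {
    assert(d.rank == 1);
  }
  char *get() const { return d_.base + offset_; }
  void advance() { offset_ += d_.byteStride[0]; }

private:
  const ToyDescriptor &d_;
  std::ptrdiff_t offset_ = 0;
};

// Element-by-element copy between two possibly discontiguous arrays that
// have the same shape and element size.
template <int RANK>
void toyShallowCopy(const ToyDescriptor &to, const ToyDescriptor &from) {
  std::size_t count = 1;
  for (int j = 0; j < RANK; ++j) {
    count *= static_cast<std::size_t>(from.extent[j]);
  }
  ToyIterator<RANK> src(from);
  ToyIterator<RANK> dst(to);
  for (std::size_t i = 0; i < count; ++i) {
    std::memcpy(dst.get(), src.get(), from.elemBytes);
    src.advance();
    dst.advance();
  }
}
```

A caller would pick the instantiation from the runtime rank, for example `toyShallowCopy<1>(to, from)` when both descriptors have rank 1, and fall back to a generic path for ranks that have no specialisation.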

Furthermore, depending on the pointer type passed to memcpy, the
optimiser can remove the memcpy calls from ShallowCopy altogether, which
can result in substantial performance improvements on its own.
Specialise ShallowCopy for various element pointer types to make these
optimisations possible.
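
One way to picture the pointer-type specialisation is the sketch below. It assumes that the benefit comes from giving each instantiation a fixed element type and a compile-time copy size, so that a compiler is free to lower the `memcpy` call to a plain load and store; the commit message does not spell out the mechanism, and the actual flang-rt dispatch may differ. All names here (`toyCopyFixed`, `toyCopyBytes`, `toyAligned`, `toyStridedCopy`) are hypothetical, and the alignment check is a precaution added for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Typed path: the element type T fixes the copy size at compile time.
template <typename T>
void toyCopyFixed(char *to, std::ptrdiff_t toStride, const char *from,
    std::ptrdiff_t fromStride, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    std::memcpy(reinterpret_cast<T *>(to), reinterpret_cast<const T *>(from),
        sizeof(T));
    to += toStride;
    from += fromStride;
  }
}

// Fallback path: element size is only known at run time.
inline void toyCopyBytes(char *to, std::ptrdiff_t toStride, const char *from,
    std::ptrdiff_t fromStride, std::size_t count, std::size_t elemBytes) {
  for (std::size_t i = 0; i < count; ++i) {
    std::memcpy(to, from, elemBytes);
    to += toStride;
    from += fromStride;
  }
}

// True when the base address and the stride are both multiples of `align`,
// so every element visited is suitably aligned for the typed path.
inline bool toyAligned(
    const void *p, std::ptrdiff_t stride, std::size_t align) {
  return reinterpret_cast<std::uintptr_t>(p) % align == 0 &&
      stride % static_cast<std::ptrdiff_t>(align) == 0;
}

// Chooses a typed instantiation from the run-time element size.
// Returns true if a typed path was used, false for the byte-wise fallback.
inline bool toyStridedCopy(char *to, std::ptrdiff_t toStride, const char *from,
    std::ptrdiff_t fromStride, std::size_t count, std::size_t elemBytes) {
  assert(elemBytes > 0);
  auto aligned = [&](std::size_t a) {
    return toyAligned(to, toStride, a) && toyAligned(from, fromStride, a);
  };
  switch (elemBytes) {
  case 2:
    if (aligned(2)) {
      toyCopyFixed<std::uint16_t>(to, toStride, from, fromStride, count);
      return true;
    }
    break;
  case 4:
    if (aligned(4)) {
      toyCopyFixed<std::uint32_t>(to, toStride, from, fromStride, count);
      return true;
    }
    break;
  case 8:
    if (aligned(8)) {
      toyCopyFixed<std::uint64_t>(to, toStride, from, fromStride, count);
      return true;
    }
    break;
  default:
    break;
  }
  toyCopyBytes(to, toStride, from, fromStride, count, elemBytes);
  return false;
}
```

The boolean return value exists only so that the chosen path can be observed in a test; a production routine would not need it.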

Finally, replace the call to Assign inside CopyInAssign with a call to
the newly optimised ShallowCopy.

For the thornado-mini application, this reduces the runtime by 27.7%.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>