aboutsummaryrefslogtreecommitdiff
path: root/llvm/tools/llvm-readobj/llvm-readobj.cpp
diff options
context:
space:
mode:
authorMomchil Velikov <momchil.velikov@arm.com>2022-12-14 14:55:59 +0000
committerMomchil Velikov <momchil.velikov@arm.com>2022-12-14 15:30:32 +0000
commite6b9fc4c8be00660c5e1f1605a6a5d92475bdba7 (patch)
treedf05bb677253a70edbc4fe1f90e28767155161bf /llvm/tools/llvm-readobj/llvm-readobj.cpp
parent3a43e68ed1cbd96bc70359f0d9dced517195ae9c (diff)
downloadllvm-e6b9fc4c8be00660c5e1f1605a6a5d92475bdba7.zip
llvm-e6b9fc4c8be00660c5e1f1605a6a5d92475bdba7.tar.gz
llvm-e6b9fc4c8be00660c5e1f1605a6a5d92475bdba7.tar.bz2
[FuncSpec] Global ranking of specialisations
The `FunctionSpecialization` pass chooses specializations among the opportunities presented by a single function and its calls, progressively penalizing subsequent specialization attempts by artificially increasing the cost of a specialization, depending on how many specialization were applied before. Thus the chosen specializations are sensitive to the order the functions appear in the module and may be worse than others, had those others been considered earlier. This patch makes the `FunctionSpecialization` pass rank the specializations globally, i.e. choose the "best" specializations among the all possible specializations in the module, for all functions. Since this involved quite a bit of redesign of the pass data structures, this patch also carries: * removal of duplicate specializations * optimization of call sites update, by collecting per specialization the list of call sites that can be directly rewritten, without prior expensive check if the call constants and their positions match those of the specialized function. A bit of a write-up up about the FuncSpec data structures and operation: Each potential function specialisation is kept in a single vector (`AllSpecs` in `FunctionSpecializer::run`). This vector is populated by `FunctionSpecializer::findSpecializations`. The `findSpecializations` member function has a local `DenseMap` to eliminate duplicates - with each call to the current function, `findSpecializations` builds a specialisation signature (`SpecSig`) and looks it in the duplicates map. If the signature is present, the function records the call to rewrite into the existing specialisation instance. If the signature is absent, it means we have a new specialisation instance - the function calculates the gain and creates a new entry in `AllSpecs`. Negative gain specialisation are ignored at this point, unless forced. The potential specialisations for a function form a contiguous range in the `AllSpecs` [1]. This range is recorded in `SpecMap SM`, so we can quickly find all specialisations for a function. Once we have all the potential specialisations with their gains we need to choose the best ones, which fit in the module specialisation budget. This is done by using a max-heap (`std::make_heap`, `std::push_heap`, etc) to find the best `NSpec` specialisations with a single traversal of the `AllSpecs` vector. The heap itself is contained with a small vector (`BestSpecs`) of indices into `AllSpecs`, since elements of `AllSpecs` are a bit too heavy to shuffle around. Next the chosen specialisation are performed, that is, functions cloned, `SCCPSolver` primed, and known call sites updated. Then we run the `SCCPSolver` to propagate constants in the cloned functions, after which we walk the calls of the original functions to update them to call the specialised functions. --- [1] This range may contain specialisation that were discarded and is not ordered in any way. One alternative design is to keep a vector indices of all specialisations for this function (which would initially be, `i`, `i+1`, `i+2`, etc) and later sort them by gain, pushing non-applied ones to the back. This has the potential to speed `updateCallSites` up. Reviewed By: ChuanqiXu, labrinea Differential Revision: https://reviews.llvm.org/D139346 Change-Id: I708851eb38f07c42066637085b833ca91b195998
Diffstat (limited to 'llvm/tools/llvm-readobj/llvm-readobj.cpp')
0 files changed, 0 insertions, 0 deletions