author: Tamar Christina <tamar.christina@arm.com> 2024-12-18 09:02:46 +0000
committer: Tamar Christina <tamar.christina@arm.com> 2024-12-18 09:03:37 +0000
commit: 18aff7644ad1e44dc146d36a2b7e397977aa47ac
tree: 4a7b4f340f646c011a97f89c01a2d2eb5059eb40
parent: c5424185b0c3652086efc914fa1e0c83365f6072
libstdc++: Add inline keyword to _M_locate

In GCC 12 there was a ~40% regression in the performance of hashmap->find.

This regression came about accidentally:

Before GCC 12 the find function was small enough that IPA would inline it even
though it wasn't marked inline. In GCC 12 an optimization was added to perform
a linear search when the hashmap contains only a few entries.

This increased the size of the function enough that IPA would no longer inline
it. Inlining had avoided two problems:

1. The return value is a reference, so it has to be returned and dereferenced
   by the caller even though the search loop may already have dereferenced it.
2. The pattern is hard for branch predictors to track. This causes a large
   number of branch misses if the value is immediately checked and branched
   on, e.g. if (a != m.end()), which is a common pattern.

The patch fixes both of these issues by adding the inline keyword to _M_locate
so that the inliner considers inlining it again.

This and the other patches have been run through several benchmarks in which
the size, the number of elements searched for, the type (reference vs value),
etc. were varied.

The change shows no statistical regression, an average find improvement of
~27%, and improvements ranging between ~10% and ~60%.

A selection of the results:

+-----------+--------------------+-------+----------+
| Group     | Benchmark          | Size  | % Inline |
+-----------+--------------------+-------+----------+
| Find      | unord<uint64_t     | 11274 | 53.52%   |
| Find      | unord<uint64_t     | 11254 | 47.98%   |
| Find Mult | unord<uint64_t     |    12 | 47.62%   |
| Find Mult | unord<std::string  |    12 | 44.94%   |
| Find Mult | unord<std::string  |    10 | 44.89%   |
| Find Mult | unord<uint64_t     |    11 | 40.90%   |
| Find Mult | unord<uint64_t     |   352 | 30.57%   |
| Find      | unord<uint64_t     |   351 | 28.27%   |
| Find Mult | unord<uint64_t     |   342 | 26.80%   |
| Find      | unord<std::string  |    12 | 25.66%   |
| Find Mult | unord<std::string  |   352 | 23.12%   |
| Find      | unord<std::string  |    13 | 20.36%   |
| Find Mult | unord<std::string  |   355 | 19.23%   |
| Find      | unord<std::string  |   353 | 18.59%   |
| Find      | unord<uint64_t     |   350 | 15.43%   |
| Find      | unord<std::string  | 11260 | 11.80%   |
| Find      | unord<std::string  |   352 | 11.12%   |
| Find      | unord<std::string  | 11262 |  9.97%   |
+-----------+--------------------+-------+----------+

libstdc++-v3/ChangeLog:

	* include/bits/hashtable.h: Inline _M_locate.

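For readers unfamiliar with the pattern the message describes, the following is a minimal sketch and not the libstdc++ code. `lookup_or` shows the common find-then-compare-with-end() caller. `TinyTable` and its `locate` member are hypothetical names that only mirror the shape of the change: a member function of a class template, defined out of class, with the inline keyword added. The sketch assumes that the compiler treats the keyword as a hint to its inlining heuristics; whether the call is actually inlined remains the compiler's decision.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical container (not part of libstdc++) whose lookup helper is
// declared in the class and defined outside it, like _M_locate.
template <typename Key, typename Value>
class TinyTable {
public:
    void insert(const Key& k, const Value& v) { map_.emplace(k, v); }

    // Returns a pointer to the mapped value, or nullptr if the key is absent.
    const Value* locate(const Key& k) const;

private:
    std::unordered_map<Key, Value> map_;
};

// Out-of-class definition. The `inline` keyword here mirrors the one-word
// change in the patch; it does not change what the function computes.
template <typename Key, typename Value>
inline const Value*
TinyTable<Key, Value>::locate(const Key& k) const
{
    auto it = map_.find(k);
    return it != map_.end() ? &it->second : nullptr;
}

// The caller pattern from the commit message: the result of find() is
// compared against end() and branched on immediately.
inline int lookup_or(const std::unordered_map<std::uint64_t, int>& m,
                     std::uint64_t key, int fallback)
{
    auto it = m.find(key);
    if (it != m.end())
        return it->second;
    return fallback;
}

// Small usage example; returns true when all lookups behave as expected.
inline bool demo()
{
    std::unordered_map<std::uint64_t, int> m{{1, 10}, {2, 20}};
    TinyTable<std::string, int> t;
    t.insert("a", 1);

    const int* hit = t.locate("a");
    const int* miss = t.locate("b");
    return lookup_or(m, 2, -1) == 20
        && lookup_or(m, 3, -1) == -1
        && hit != nullptr && *hit == 1
        && miss == nullptr;
}
```

When the lookup is inlined into a caller such as `lookup_or`, the compiler can see both the search loop and the end() comparison in one function, which is the situation the commit message describes as beneficial.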
-rw-r--r-- libstdc++-v3/include/bits/hashtable.h 2
1 files changed, 1 insertions, 1 deletions
diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 2dc2498..cd60fb5 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -2219,7 +2219,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
typename _ExtractKey, typename _Equal,
typename _Hash, typename _RangeHash, typename _Unused,
typename _RehashPolicy, typename _Traits>
- auto
+ inline auto
_Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
_Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
_M_locate(const key_type& __k) const