dict: consider dummy entries when determining when to expand hash table

This issue was caused by the fix in 24b234543c7322d2dd20339b45367fa3f4c53495 Because we used closed hashing for the dict hash table, it is important that the table doesn't get too full, as it gets very inefficient due to hash collisions. When allowing for space, also consider dummy entries that consume slots. If there are too many dummy slots, the hash table is expanded, clearing out the dummy slots. Without this fix it is possible to get unlucky when filling a dictionary and emptying it again (twice) will result all slots become dummy slots and thus no more entries can be added. See REGTEST 54 Signed-off-by: Steve Bennett <steveb@workware.net.au>
author: Steve Bennett <steveb@workware.net.au> 2021-12-16 07:29:04 +1000
committer: Steve Bennett <steveb@workware.net.au> 2021-12-16 08:23:50 +1000
commit: 70bdc32d6e4dc1b9239d062cf9e4e70f605f0679 (patch)
tree: 95d347fa8a8b3b6e6bec0f73f29e163d03e4187a /jim.c
parent: 7a518896462aef645e278e7191bb5af12f0fc25a (diff)
download: jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.zip
jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.tar.gz
jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.tar.bz2
1 files changed, 7 insertions, 1 deletions
diff --git a/jim.c b/jim.c
index d01b147..8ae6f35 100644
--- a/jim.c
+++ b/jim.c
@@ -7402,6 +7402,7 @@ static int JimDictHashFind(Jim_Dict *dict, Jim_Obj *keyObjPtr, int op_tvoffset)
             if (tvoffset) {
                 /* Found, remove with -1 meaning a removed entry */
                 dict->ht[idx].offset = -1;
+                dict->dummy++;
             }
             /* else if not found, return 0 */
             break;
@@ -7410,6 +7411,7 @@ static int JimDictHashFind(Jim_Dict *dict, Jim_Obj *keyObjPtr, int op_tvoffset)
                 /* Not found so add it at the the first removed entry, or the end */
                 if (first_removed != ~0) {
                     idx = first_removed;
+                    dict->dummy--;
                 }
                 dict->ht[idx].offset = dict->len + 1;
                 dict->ht[idx].hash = h;
@@ -7475,8 +7477,12 @@ static int JimDictAdd(Jim_Dict *dict, Jim_Obj *keyObjPtr)
      * do nothing.
      * This way we don't need to recalculate the hash index in case
      * it didn't exist and is added.
+     *
+     * Note that dict->len includes both keys and values, so
+     * a dict with 4 keys needs at least 8 entries.
+     * Also dummy entries take up space, so take those into account too.
      */
-    if (dict->size <= dict->len) {
+    if (dict->size <= dict->len + dict->dummy) {
         /* The first add grows the size to 8, and thereafter it is doubled
          * in size. Note that hash table sizes are always powers of two.
          */
author	Steve Bennett <steveb@workware.net.au>	2021-12-16 07:29:04 +1000
committer	Steve Bennett <steveb@workware.net.au>	2021-12-16 08:23:50 +1000
commit	70bdc32d6e4dc1b9239d062cf9e4e70f605f0679 (patch)
tree	95d347fa8a8b3b6e6bec0f73f29e163d03e4187a /jim.c
parent	7a518896462aef645e278e7191bb5af12f0fc25a (diff)
download	jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.zip jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.tar.gz jimtcl-70bdc32d6e4dc1b9239d062cf9e4e70f605f0679.tar.bz2