aboutsummaryrefslogtreecommitdiff
path: root/jim_tcl.txt
diff options
context:
space:
mode:
authorSteve Bennett <steveb@workware.net.au>2020-06-06 15:06:34 +1000
committerSteve Bennett <steveb@workware.net.au>2020-06-25 09:01:02 +1000
commit99b7ce979f72087be74b039aa988a4f25f3b6d27 (patch)
tree6b23f609b5309e3eea97b49e0eaf729f423a1766 /jim_tcl.txt
parent015569f090acda5702eafe6487b534fa76d354d5 (diff)
downloadjimtcl-99b7ce979f72087be74b039aa988a4f25f3b6d27.zip
jimtcl-99b7ce979f72087be74b039aa988a4f25f3b6d27.tar.gz
jimtcl-99b7ce979f72087be74b039aa988a4f25f3b6d27.tar.bz2
core: dicts (and arrays) now preserve insertion order
Although the documentation has always stated that, like Tcl, insertion order of dictionaries was preserved, this has never been the case. Instead, dictionaries were implemented as simple hash tables that did not preserve order. Now, a new implementation of dictionaries preserves insertion order and has a number of other benefits. Instead of hashing keys and storing keys and values in the hash table, the keys and values are not stored in a separate table, exactly as lists are stored, with alternating key, value pairs. Iterating over the dictionary is exactly like iterating over a list, where the order is consistent. The hash table uses closed hashing rather than open hashing to avoid allocatation of hash entry structures. Instead a fixed (but expandable) hash table maps the key hash to the offset in the key/value table. This use of offsets means that if the key/value table grows, the offsets remain valid. Likewise, if the hash table needs to grow, the key, value table remains unchanged. In addition to the offset (which indexes to the value, and 0 means the hash table entry is unused), the original hash is stored in the hash table. This reduces the need for object comparisons on hash entry collisions. The collision resolution algorithm is the same as that used by Python: peturb >>= 5; idx = (5 * idx + 1 + peturb) & dict->sizemask; In order to reduce collisions, the hash table is expanded once it reaches half full. This is more conservative that Python where the table is expanded when it is two thirds full. Note that if a hash collision occurs and then the original entry that cased the hash collision is removed, we still need to continue iterating when searching for the new key. Don't stop at the now-empty slot. So these entries are marked with offset=-1 to indicate that they need to be skipped. In addition, the new faster hashing algorithm from Tcl 8.7 is used. This the hash for integers to be calculated efficiently without requiring them to be converted to string form first. This implementation is modelled largely on the Python dict implementation. Overall the performance should be an improvement over the current implementation, whilst preserving order. Dictionary creating and insertion should be faster as hash entries do not need to be allocated and resizing should be slightly faster. Entry lookup should be about the same, except may be faster for pure integer keys. Below are some indicative benchmarks. OLD NEW dict-create-1.1 Create empty dict 97.2ns . dict-create-1.2 Create small dict 440ns -27% dict-create-1.3 Create medium dict 1.54us -57% dict-create-1.4 Create large dict (int keys) 130us -80% dict-create-1.5 Create large dict (string keys) 143us -75% dict-set-1.1 Replace existing item 258ns -34% dict-set-1.2 Replace nonexistent item 365ns -49% dict-exists-1.1 Find existing item 55.7ns -5% dict-exists-1.2 Find nonexistent item 55.0ns -5% Signed-off-by: Steve Bennett <steveb@workware.net.au>
Diffstat (limited to 'jim_tcl.txt')
-rw-r--r--jim_tcl.txt1
1 files changed, 1 insertions, 0 deletions
diff --git a/jim_tcl.txt b/jim_tcl.txt
index 25e2130..2a60a3c 100644
--- a/jim_tcl.txt
+++ b/jim_tcl.txt
@@ -64,6 +64,7 @@ Changes between 0.78 and 0.79
8. `regsub` now fully supports +{backslash}A+
9. Add `socket pty` to create a pseudo-tty pair
10. Null characters (\x00) are now supported in variable and proc names
+11. dictionaries and arrays now preserve insertion order, matching Tcl and the documentation
Changes between 0.77 and 0.78
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~