
Performance Notes

Space considerations

Ternary trees are notably larger than hash maps or most binary tree designs: each node holds only one character (instead of a whole key) and uses 3-5 pointers. Our nodes consist of 4 size_t values (16 bytes on 32-bit platforms), regardless of the platform's pointer size or of the char type (as long as it is at most 2 bytes, like wchar_t).

The shared parts of strings save space: in a typical English dictionary, each key shares over half its nodes with other keys, so the allocated space is about half of the total key length times 16. In a scrabble dictionary like the one reported below, which contains all valid word endings, most nodes are shared, so its storage cost is "only" total key length times 0.35 times 16, or less than 6 bytes per char. With a wchar_t key type, that storage cost cannot be considered overly large.
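
For illustration, the sketch below shows one way such a 4-word node could be laid out. This is an assumption for exposition only (the field names are hypothetical); the actual structure is described under Implementation Details.

    #include <cstddef>

    // Illustration only: one way a ternary-tree node can occupy 4 size_t words.
    // Children are held as indices into a node pool rather than raw pointers,
    // and the split character gets a word of its own, so any character type up
    // to 2 bytes wide (char, or wchar_t on Windows) fits without changing the layout.
    struct tst_node
    {
        std::size_t splitchar;  // character compared at this node; 0 marks end of key
        std::size_t lokid;      // pool index of subtree with smaller characters (0 = none)
        std::size_t eqkid;      // pool index of subtree for the next key position (0 = none)
        std::size_t hikid;      // pool index of subtree with larger characters (0 = none)
    };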

See also Implementation Details

Lookup speed

The complexity of ternary tree operations is basically the same as for binary trees (logarithmic in tree size), but with quite different constant factors. See [1].

Overall lookup and iteration speed also depends on application factors, e.g. whether strings are inserted in sorted or random order.
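
To see where the constant factors come from, here is a minimal lookup sketch over the hypothetical node layout shown above (again an illustration, not the ternary_tree interface): every character comparison visits one node, so a successful find touches one node per key character plus one per left/right branch taken.

    #include <cstddef>
    #include <vector>

    // Returns true if the NUL-terminated 'key' is stored in the tree rooted at
    // pool[root]. Uses the illustrative tst_node layout from above; index 0
    // means "no child".
    bool tst_contains(const std::vector<tst_node>& pool, std::size_t root, const char* key)
    {
        std::size_t n = root;
        while (n != 0) {
            const tst_node& node = pool[n];
            const std::size_t c = static_cast<unsigned char>(*key);
            if (c < node.splitchar)
                n = node.lokid;        // branch left, stay on the same key position
            else if (c > node.splitchar)
                n = node.hikid;        // branch right, stay on the same key position
            else if (c == 0)
                return true;           // matched the end-of-key marker
            else {
                ++key;                 // consume one character
                n = node.eqkid;
            }
        }
        return false;
    }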

Rough speed estimates, compared to the STLport hash_map and map containers and to the C implementations (DDJ and libtst), are given in the performance tables below.

Since each character in a key occupies a separate node in the internal tree, iterating over values is a little slower than for other tree-based containers.


Rough performance measurements on a 150,000+ word English dictionary.

Lookup, insert and erase figures are normalized to the libtst successful-find operation on a sorted-insertion tree.
Iteration is normalized to hash_map::iterator.

Average string length was 8.3 chars. The TST trees used 3.12 nodes per string, so prefix sharing makes these strings use 0.37 nodes (just under 6 bytes) per char. All in all, 1242 k characters were stored in 7.3 MB (on 32-bit architectures).
The type of keys used affects performance, though less than the order of insertion. Some reference runs with 10-80 k words were made to verify the performance tendencies.
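As a rough consistency check of these figures (assuming 16-byte nodes, i.e. 32-bit size_t): 3.12 nodes per string / 8.3 chars per string ≈ 0.37 nodes per char, and 0.37 × 16 bytes ≈ 6 bytes per char; 1242 k characters at about 6 bytes each comes to roughly 7-7.5 MB, in line with the 7.3 MB figure above.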
Function          | find success          | find failed            | insert                 | erase       | iteration
DDJ               | sort: 0.96, rand: 2.9 | sort: 0.32, rand: 0.42 | sort: 2.75, rand: 6.4  | 2.74        | (N/A)
libtst            | sort: 1.0, rand: 2.6  | sort: 0.18, rand: 0.18 | sort: 1.6, rand: 3.4   | 0.5         | (N/A)
ternary_tree 0.67 | sort: 2.0, rand: 3.7  | sort: 0.8, rand: 0.77  | sort: 3.3, rand: 4.67  | 0.9         | sort: 1.2, rand: 2.1
stlport::hash_map | 2.1                   | 1.2                    | sort: 4.75, rand: 3.6  | O(n) always | 1.0; O(log n)
stlport::map      | sort: 4.5, rand: 8.3  | sort: 4.0, rand: 3.5   | sort: 7.3, rand: 9.8   | 0.84        | 2.1; O(1)

The figures marked "sort" and "rand" refer to trees built by sorted or random order insertion.

Relative performance of advanced searches
The advanced searches are implemented only for DDJ and our ternary_tree, and they perform at nearly the same speed.

Timings in the tables below are wall-clock milliseconds spent in 1000 different searches (equivalently, microseconds per search). The machine was a 2.4 GHz Intel with 512 MB RAM running Windows XP. For reference, single-key lookup in the tests above ran at about 0.3 microseconds/call for DDJ and 0.77 microseconds/call for ternary_tree.
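
To make concrete what a partial-match search does, the sketch below counts the keys matching a pattern in which '?' stands for any single character. It is an illustration over the hypothetical node layout from the Space considerations section, not the DDJ or ternary_tree interface. Each wildcard forces the traversal into all three subtrees instead of one, which is why the cost and the number of keys found grow quickly with the number of wildcards, as the tables show.

    #include <cstddef>
    #include <vector>

    // Counts keys matching 'pat', where '?' matches any single character.
    // Classic ternary-tree partial-match traversal (after Bentley & Sedgewick),
    // adapted to the illustrative index-based tst_node layout used earlier.
    std::size_t partial_match(const std::vector<tst_node>& pool, std::size_t n, const char* pat)
    {
        if (n == 0)
            return 0;
        const tst_node& node = pool[n];
        const std::size_t c = static_cast<unsigned char>(*pat);
        const bool wild = (c == '?');
        std::size_t hits = 0;
        if (wild || c < node.splitchar)
            hits += partial_match(pool, node.lokid, pat);       // pattern char may lie to the left
        if (c == 0 && node.splitchar == 0)
            ++hits;                                              // pattern and key end together
        if ((wild || c == node.splitchar) && c != 0 && node.splitchar != 0)
            hits += partial_match(pool, node.eqkid, pat + 1);    // consume one pattern character
        if (wild || c > node.splitchar)
            hits += partial_match(pool, node.hikid, pat);        // pattern char may lie to the right
        return hits;
    }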

Function            | partial match, 1 wildcard | partial match, 3 wildcards | partial match, 5 wildcards
keys found per call | 1.3                       | 12.4                       | 106.3
DDJ                 | sort: 8.1, rand: 9.7      | sort: 100, rand: 60.4      | sort: 526, rand: 274
ternary_tree 0.67   | sort: 6.3, rand: 6.3      | sort: 96, rand: 63.5       | sort: 523, rand: 244

Function            | hamming, dist=1         | hamming, dist=3              | hamming, dist=5
keys found per call | 3.5                     | 181.4                        | 4207.7
DDJ                 | sort: 68.0, rand: 44.8  | sort: 3320.1, rand: 1815.0   | sort: 17908.4, rand: 5937.5
ternary_tree 0.67   | sort: 57.64, rand: 40.2 | sort: 2957.67, rand: 1655.91 | sort: 16512.5, rand: 5225.15

Function            | levenshtein, dist=1        | levenshtein, dist=3         | levenshtein, dist=5
keys found per call | 4.01                       | 316.0                       | 8294.3
ternary_tree 0.675  | sort: 172.93, rand: 158.56 | sort: 12088.2, rand: 8427.8 | sort: 81116.9, rand: 66182.6

Function            | combinatorial, 0 wildcards | combinatorial, 1 wildcard    | combinatorial, 2 wildcards
keys found per call | 230.2                      | 2310.2                       | 10093.7
ternary_tree 0.67   | sort: 552.34, rand: 417.97 | sort: 4189.05, rand: 2880.18 | sort: 13585.6, rand: 8629.8

ternary_tree usually performs a few percent faster than the DDJ implementation; as explained above, one could just as well say they are equally fast. This by itself is good news, since our insertion and single-key lookup are quite a bit slower than in the C implementations.


Measurements on 50,000 US library code strings (like "WAFR______5054____33")

Size of the TST trees was 10.2 nodes per string, using 7.1 MB to store 955 k characters (on 32-bit architectures).
With longer strings (average 21 chars instead of the 6-8 typical of English words), TST failed lookup is no longer faster than successful search. Here the TSTs lose against hash_map, being half as fast for failed search and marginally slower for successful find. Iterators are significantly more expensive, since more nodes must be traversed to step between end nodes. If advanced searches or sorted order are needed, this is of course a small price to pay.

Again, everything except iteration is normalized to the libtst successful-find operation on a sorted-insertion tree.
Iteration is normalized to the hash_map iterator.

Function          | find success         | find failed            | insert                    | erase                  | iteration
DDJ               | sort: 1.1, rand: 1.2 | sort: 1.25, rand: 1.07 | sort: 1.85, rand: 2.1     | (N/A)                  | (N/A)
libtst            | sort: 1.0, rand: 1.0 | sort: 1.0, rand: 1.0   | sort: 1.5, rand: 1.8      | sort: 1.1              | (N/A)
Boost.Spirit      | sort: 1.4, rand: 1.6 | sort: 1.6, rand: 1.4   | sort: 16.2, rand: 17-110+ | sort: 5.25, rand: 88.5 | (N/A)
ternary_tree 0.65 | sort: 1.7, rand: 1.8 | sort: 1.0, rand: 1.5   | sort: 3.0, rand: 3.3      | sort: 0.8, rand: 0.9   | sort: 2.8, rand: 3.3
stlport::hash_map | 0.9                  | 0.5                    | sort: 1.9, rand: 1.5      | O(n)                   | 1.0
stlport::map      | sort: 3.1, rand: 2.7 | sort: 3.0, rand: 2.6   | sort: 4.4, rand: 3.8      | 0.35                   | sort: 1.4, rand: 1.4

