[Lvlug] mai-tree Continues to r0x0rz yu0r m0m

Nachiketa Kalam fingolfin at thelinuxlink.net
Tue Apr 12 08:37:52 EDT 2005


> Ok...I am completely rearranging mai-tree. I thought it would be fun to
> aim for speed. So I am rewriting vector.c to realloc a single array when
> the vector must expand instead of allocating new blocks (see the
> infinite array example in O'Reilly Practical C Programming)

This is exactly what I did. Now getting a node six-deep in an nary tree
takes about 90 ns, even with va_arg. I ended up rewriting nAry_Get with
va_arg to avoid using a pointer to unsigned int, for reasons illuminated
below.

Also, I am reimplementing pn-tree.c. Everything is getting a big
overhaul for functionality and speed. I figured that limiting the size
of the key to 16 bytes was optimal for speed, memory, and efficiency. By
skipping over an interval proportional to the length of the key (the
length divided by 16), I got the results I wanted. I also briefly
considered factoring only the lower half of each character...which
actually worked pretty well, but I dropped it because it might yield
less-than-perfect results with other languages using the full range of
possible char values, like Dan Bernstein's gay++ hashing algorithm.

More critically, I realized that assigning to a dereferenced int pointer
is a /very/ expensive operation. So I inlined the factorization code by
making it a macro; for, as a function, I had to find some way to
manipulate the arguments or return a value. Pointer dereference was slow
and using a structure was even slower. So now, by using a macro that
directly operates on variables in the function, I have a BLAZINGLY fast
algorithm.

Statistics:

All keys created by extracting consecutive segments of char's from
HOWTOs of length according to test, reducing the list to unique members
and selecting a random 500.

Very short keys (4 chars):
 * Synthesis for fifty million iterations: 2 seconds
 * Collision rate: 32%

Short keys (6 chars):
 * Synthesis for fifty million iterations: 2 seconds
   (40 ns per iteration)
 * Collision rate: 15%

Average keys (8 chars):
 * Synthesis for fifty million iterations: 2 seconds
   (40 ns per iteration)
 * Collision rate: 9%

Average-long keys (12 chars):
 * Synthesis for fifty million iterations: 3 seconds
 * Collision rate: 3%

Long keys (24 chars):
 * Synthesis for fifty million iterations: 5 seconds
 * Collision rate: practically 0%

I also tried testing the efficiency of a little more than 3200 8-char
keys. Granted, the efficiency started to slacken, but wasn't that bad.
My little Perl program reports the number of chains with a given length:

1, 1599 <- no collision
2, 351
3, 157
4, 49
5, 33
6, 13
7, 4
9, 1
11, 1

And here, the report for 8000 8-char keys:

1, 2356 <- no collision
2, 683
3, 319
4, 163
5, 109
6, 71
7, 44
8, 28
9, 26
10, 18
11, 12
12, 14
13, 9
14, 10
15, 5
16, 3
18, 4

48 percent unique keys. At the /very absolute/ worst, one would have to
go through 18 members of a vector. The worse half (lengths 9 through 18)
add up to 1116 members.

Then I hashed /usr/share/dict/words for shits and giggles. Here, at least
using a vector to resolve collisions, performance really starts to
degrade...several chains would grow to over fifty members long. However, I
could reimplement the algorithm to 'decompose' the vector into /another/
tree, whose factor indices are selected a little differently. But it's not
a big concern at this point in time.

And how about memory use? Though I am still putting things together, I
estimate a complete mai-tree with 8000 members would take up about 400k,
with all the overhead. I'm not sure if that's really good or bad. I did
compile Perl with -DDEBUGGING_MSTATS and noted that a hash of the same
size consumed a whole 1 MB. I have favorable comparison there at least!

Well, you get the idea. This even beats SuperFastHash (see
http://www.azillionmonkeys.com/qed/hash.html), which was benchmarked to
run at some 270 ns per iteration. I also only tested key synthesis and
then used the results to intern data separately (I'll write the new glue
code sometime) Yes, his tests were on 256-byte buffers. But that doesn't
matter for me. I did tests on a hundred 256-byte buffers, which, though
they were effectively reduced to 16-bytes, produced not one indistinct
key.

And who the hell uses 256-char keys anyway?

Moreover, I have another substantial advantage: no need to do modulo by
prime. The values produces by key factorization are good unto
themselves. Consequently, I don't have to go fishing for the next prime
number every time a mai-tree grows; the vectors will simply expand as
necessary, and the growth is rather well distributed through time (I have
'incremental growth'). Indeed, the largest growth occurs in the two's
vector and it's a joke compared to resizing a hash. (It gets maybe 40
members long??)

All told, mai-tree grossly exceeded any expectation I could ever have in
terms of competing with other hashing algorithms (though it is not
exactly a true hash). If this is the only good algorithm I ever produce,
I don't care. I'll still be happy.

Now, licensing and hosting issues come into play. I want to hear
arguments from BSDL, LGPL, GPL, everyone. Where should I host this?
nongnu? sf?

--
"I would rather wear a coolie hat than a McDonald's visor. And if it
comes to pass that I must revere a king, it better not be the Burger
King."



More information about the Lvlug mailing list