TML development - introduction of a linear independent hashing algorithm

dana-edwards (74)in #tauchain • 7 years ago

In the most recent TML development, an addition to the code for TML includes a new hashing algorithm. This was a hashing algorithm I had not seen before so I asked Ohad about it and more importantly why it was being implemented. Due to a quirk in partial fixed point semantics, this hashing algorithm is hoped to provide efficiency dividends. The code is below for your analysis:

Logprime (logarithmic prime hashing algorithm)

The fractional part of the logarithmic prime is extracted.

// generate the first 64 bits of fractional part of the log of 256 primes
// starting from 3. is not part of the buildchain (but pre-gen) so not to
// require mpfr in order to build the project. logprimes.h can be generated by:
// g++ lpgen.cpp -lmpfr && ./a.out > logprimes.h

#include <mpreal.h>
#include <cmath>
using namespace mpfr;
typedef uint64_t nat;
bool isprime(nat n) {
    if (n<1) return false;
    if (n==2) return true;
    nat k = ::ceil(::sqrt(n));
    while (k > 1) if (!(n%k--)) return false;
    return true;
}
nat fpl(mpreal x) {
    mpreal l = log2(x), f = l - floor(l);
    f *= (1<<15); f *= (1<<15); f *= (1<<15);
    f *= (1<<15); f *= (1<<4);
    return (nat)ceil(f);
}
int main() {
    mpreal::set_default_prec(128);
    std::cout<<"// this is an automatically generated file"<<std::endl;
    std::cout<<"const uint64_t logprimes[256] = {";
    for (nat n = 3, k = 0;k < 256;++n)
        if (isprime(n))
            if (++k < 256)
                std::cout<<fpl(n)<<"U,";
            else {
                std::cout<<fpl(n)<<"U};"<<std::endl;
                return 0;
            }
    return 0;
}

The code above shows how this happens. It will be interesting in the next few days to see how this is implemented. Linear independence appears to be a necessary feature and this specific method is known to achieve that. The hash allows for storage of results per stage so that they can be efficiently compared. Programmers may be aware of this kind of usage for hashing algorithms for memory allocation in general, so it can be obvious how this can be used to improve efficiency as this allows the for instance a loop by which every item only has to be scanned exactly once, rather than having to scan every item each time. This for example would be useful if a card counting program were to look at each card in a deck to determine whether or not it has been seen before.