Why am I doing this?
The hashcash algorithm forms this basis of cryptocurrencies like Bitcoin. I have been getting into this kind of technology, like the blockchain and cryptocurrency, lately (this is the future!) and I thought it would be fun to try to implement the hashcash algorithm in F#. Plus the [proof-of-work system] (https://en.wikipedia.org/wiki/Proof-of-work_system) seems to have lots of applications, so I might as well get to know it.
What does Hashcash do?
Let's say you want to send an email. And let's say the recipient is checking for spam. You want a way to prove you aren't spam. How do you do it? With hashcash you could do a wee calculation and send that in the header of the email. Then the recipient could check that header value to verify you did that work. But how does this help? It reduces the incentive to send spam. Let's say without this hashcash it takes 1 second to send 1 email. But with hashcash it takes 10 seconds to send 1 email, because calculating this header value takes time. This means the cost of sending spam just increased 10 times. This effectively makes sending email not worth it.
How does Hashcash work?
It's actually pretty simple.
Basically, you build a string with a randomly generated string inside of it, do a sha1 hash on this thing, and then check the first 20 bits to see if they are zero. If they are then you have a done your work and can send an email with proof that you did this work, or do a thing, etc. If not, repeat, increasing a counter that is part of the string we hash. Do it until you get all zeroes in the first 20 bits.
The receiving end can then check your proof and accept or deny your messages based on that.
To do this I followed the Wikipedia article.
You can find the code on my Github.
I did this on .NET Core with F#, so the code is cross-platform and you should be able to run it pretty much anywhere as long of you have .NET Core installed (v. 1.0 as of 8/5/2017). I wrote the code on my Windows 10 desktop but ran and compiled the code on my Ubuntu sub-system. We live in interesting times.
To get started:
- Install .NET Core
- Clone the repo
- On the command line get yourself into the code's directory
- Install Paket or use Visual Studio Code to run Paket and then restore the packages.
- Now go to the command line (I hope you are all using ConEmu) and do
dotnet run sender
Watch as the magic of cryptographish things happen and you generate a hash header in, on average, 2^20 tries. On my laptop this takes about 5-10 seconds.
Now you have a header and we can verify it.
Copy the header and do a verification by using the recipient command like this. The verification checks to see if the first 20 bits are zero.
dotnet run recipient 1:20:1708064109:email@example.com::TVJSQ1dCWTc=:MTE2NjEyNQ==
Notice that last bit MTE2NjEyNQ==. That is the counter in base64 and shows that this calculation took 1,166,125 tries, which is actually not too far off from 2^20 (1048576).
The hashcash is based on hashing a thing called the header. It looks like this:
The counter is incremented on each iteration.
I built the header like this:
let header counter (randomString:string) = let ver = sprintf "%i" 1 let bits = sprintf "%i" 20 let date = string (DateTime.Now.ToString("yyMMddmmss")) let resource = "firstname.lastname@example.org" let extension = String.Empty let rand = Convert.ToBase64String(Encoding.UTF8.GetBytes(randomString)) let base64Counter = Convert.ToBase64String(Encoding.UTF8.GetBytes(string counter)) sprintf "%s:%s:%s:%s:%s:%s:%s" ver bits date resource extension rand base64Counter
Notice that I pass in the counter and the random string. The random string stays the same for each iteration.
The random string is calculated like this:
let randomChars length = //"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" let rand = System.Random() let chars = String(Array.concat [ [|'A'..'Z'|]; [|'0'..'9'|]; [|'a'..'b'|] ]) String(Array.init length (fun _ -> chars.[rand.Next(chars.Length)]))
This will create a string of length
length that is made of big letters, little letters, and numbers.
To do a sha1 hash in F# do this:
let sha = System.Security.Cryptography.SHA1.Create() let getHash (input:string) = sha.ComputeHash(Encoding.UTF8.GetBytes(input))
To check if the first 20 bits is all zeroes I did this (a byte is 8 bits)
let checkHash (hash:byte) = let firstByte = (hash. = byte 0) let secondByte = (hash. = byte 0) let thirdByte = (hash. <= byte 15) firstByte && secondByte && thirdByte
Why check the third byte like that?
- First byte is 8 bits
- Seconds byte is 8 bits
This makes 16 bits with 4 bits left. The third byte is 8 bits but we only need 4 of them. If the first 4 bits is zero, that means the first 4 bits can have up to 4 bits equal to one. If the first 4 bits are 1(max) that is 2^0 + 2^1 + 2^2 + 2^3 = 1 + 2 + 4 + 8 = 15. Anything less than or equal to 15 should mean the first 4 bits are zeroes.
That is the most meaty code. Now, how do we pull it all together?
First I needed a way to generate an infinite sequence of numbers. I know how to do that with a loop, a for or a while, but is there is functional, more idiomatic way? I dunno! So I googled around, and yep, there is! Of course, I found this on F# for Fun and Profit. And it is kind of beautiful.
That way is
Seq.initInfinite. You can learn more about that here
under How to avoid using loops.
This will generate an infinite sequence based on a function you pass to it like this.
Seq.initInfinite (fun index -> index + 1)
This will generate a sequence from 1 to infinity.
But how to stop it?
Seq.initInfinite (fun index -> index + 1) |> Seq.takeWhile (fun index ->index < 10)
So easy, right?
In my hashcash it went like this.
let continueTaking result = not <| result Seq.initInfinite (fun index -> index + 1) |> Seq.takeWhile (fun index -> (continueTaking (compute index randomString) && index < 2000000)) //limit so it does not run forever just in case. this should find a solution on average in 2^20 tries |> Seq.iter ignore
The infinite sequence increases by 1.
The sequence stops when the hash computation results in first 20 bits being zero.
Seq.takeWhile passes that sequence on to
Seq.iter, which ignores it.
And that is pretty much it.
This is just an exercise for learning purposes. I have not vetted my implementation against any known good headers or checked against any pseudocode.
I think I could make the code flow better.
Could I game the system with some prudent parallelization?
Can I optimize this with some bit shift math or a bit mask type of operation? I dunno. I kinda like it this way. It is clear to me.
I could refactor the header function so it doesn't rebuild each element of the header each time.
Can I do this without making the counter variable mutable? Maybe with a a fold-like function.
My husband mentioned that the bit checking could be affected by the difference between big endian and little endian byte ordering. I will have to look into this more.
If you have any questions or comments, please tweet me or try to leave a comment. I am happy to chat.