If you are still not amazed by what the Python language is capable of, then in this part we are going to learn how to generate a Bitcoin address or wallet in Python. I just love how easy it is to communicate with your computer through Python if you have a Linux OS, and how many interesting projects you can make with it.
In this article I am going to analyze the source code of Electrum, a Bitcoin wallet written purely in Python. It should work with any Python 2.x, and I believe even with Python 3.x. As far as I can tell, all the dependencies this software uses are in the default packages, so no additional software is needed; it's self-contained.
Disclaimer: Use this code and information at your own risk, I shall not be responsible for any damages resulting from the use of the modified code, nor the information provided in this article. It's not recommended to modify the code that generates private keys if you don't know what you are doing!
Playing with the Code
I have downloaded the latest version of Electrum's source code from GitHub.
The seed generator is located in the lib folder, in a file named mnemonic.py, and the function is make_seed(). You can also call it from the terminal through an internal command. If you have Electrum installed, I think it's like this:
electrum make_seed --nbits 125
This would ask for a 125 bit seed (the size gets rounded up to a whole number of words, as we will see later). You can also call that mnemonic script from another Python file and customize it, for example to generate multiple seeds or to integrate it with some other code.
We will create a new file named testcall.py from which we will call this Mnemonic code; it has to be in the same lib folder though. We then run it from the terminal with the python testcall.py command.
Basically we are importing the Mnemonic class from the mnemonic.py file, referring to the file just as mnemonic. I haven't talked about classes yet; they are among the more advanced parts of the Python language. Basically they are objects that bind together functions. Here the make_seed() function is contained inside the Mnemonic class and is called through it, together with other functions that depend on each other. It could be done with just one function, but grouping things in a class is more elegant and less error prone, since the wordlist and the functions that use it stay together. I am not an expert in classes, so I'm just going to leave it at that.
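For readers who have not met classes yet, here is a minimal sketch of the pattern behind a call like Mnemonic('en').make_seed(...): a constructor that takes a language, and a method called on the resulting object. ToyMnemonic, its two tiny wordlists and pick_words() are made up for illustration and are not part of Electrum.

```python
import random


class ToyMnemonic(object):
    """Hypothetical stand-in for a class that bundles a wordlist with functions."""

    # Made-up miniature wordlists keyed by language code (not Electrum's lists).
    WORDLISTS = {
        "en": ["apple", "river", "stone", "cloud"],
        "es": ["manzana", "rio", "piedra", "nube"],
    }

    def __init__(self, lang="en"):
        # The constructor stores the data that every method will share.
        self.wordlist = self.WORDLISTS[lang]

    def pick_words(self, count):
        # A method reaches the shared wordlist through self.
        rng = random.SystemRandom()  # randomness provided by the operating system
        return " ".join(rng.choice(self.wordlist) for _ in range(count))


# Create an object for one language, then call a method on it.
print(ToyMnemonic("es").pick_words(3))
```

The object remembers its wordlist, so every method called on it works with the same language without passing it around again.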
For the Mnemonic class you can define one parameter, the language. You can see the country codes in the i18n.py file, but only some of them have wordlists available for now, visible in the wordlist folder. Here is how you create a Chinese seed; just replace that argument with the country code:
print Mnemonic('zh').make_seed('standard', 132, 1)
And this will give out a seed in Chinese.
There are also multiple types of seeds you can generate:
- standard: Normal wallet
- segwit: Support for the upcoming Segregated Witness soft fork based addresses of Bitcoin
- 2fa: Two Factor Authentication based wallets
The next argument is the num_bits variable, which from the command line is passed with the --nbits option. It is basically just the number of bits of entropy your seed will have (recommended minimum 128 for security).
The last argument is custom_entropy, basically just an integer with which your seed number gets multiplied, just in case your RNG is bad. It replaces a part of the secret with a number generated by you, of the same entropy size.
So if I call it like this, with a custom entropy number I chose, it would generate a seed this way. Of course the entropy number has to be kept secret as well:
print Mnemonic('en').make_seed('standard', 132, 2349823353453453459428932342349489238)
I don't really recommend using this option; it looks kind of weird to me. I am not a cryptography expert, but I just don't like how this inserts entropy into your number. I have heard that multiplying numbers decreases entropy, so I am not sure about this part of the code. In fact I am going to message the dev about this issue and see what his response is. However, no worries: the default wallet generation doesn't use the custom entropy part, so if you are generating a wallet through the Electrum GUI, or leaving the value at 1, then this is of no concern to you.
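To make the effect of custom_entropy a bit more concrete, here is a small sketch of the arithmetic as described above: the size of the custom number in bits is taken away from the requested size, and a random number of the remaining size is multiplied by the custom number. The function names split_bits() and combine() are made up, and the exact formula is an assumption based on that description, not a copy of Electrum's code.

```python
import math
import secrets


def split_bits(num_bits, custom_entropy=1):
    """Return (bits covered by the custom number, bits left for the RNG)."""
    # Assumption: the size of the custom number is its log2, rounded up.
    n_custom = int(math.ceil(math.log2(custom_entropy)))
    n_random = num_bits - n_custom
    if n_random < 1:
        raise ValueError("custom number is too large for the requested size")
    return n_custom, n_random


def combine(num_bits, custom_entropy=1):
    """Multiply a fresh random number by the custom number."""
    _, n_random = split_bits(num_bits, custom_entropy)
    random_part = secrets.randbelow(2 ** n_random)
    return custom_entropy * random_part


# Default value 1: log2(1) is 0, so nothing is taken away from the RNG.
print(split_bits(132))
# The large custom number from the example call above.
print(split_bits(132, 2349823353453453459428932342349489238))
print(combine(132, 2349823353453453459428932342349489238).bit_length())
```

Under this assumption, a large custom number leaves only a small share of the bits to the random number generator, which is why the custom number itself would have to stay secret.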
Auditing the Seed Generator
OK, so now that we know how to generate a seed, let's see what exactly the seed generator does. After all, anyone using Electrum has to rely on the security and integrity of this code; you could lose all your money if it were badly written. So we really have to trust this code 100% if we want to store a lot of Bitcoin in Electrum.
So let's analyze the make_seed() function; this is where the action is. First of all I will put many print statements in it. Basically I just print out each variable at each step. We are calling the make_seed() function from our testcall.py file with the python testcall.py command, where the testcall file is like this:
from mnemonic import Mnemonic
print Mnemonic('en').make_seed('standard', 132, 1)
This is just a standard seed generation, and it prints out each of those variables along the way. Well, let's take it step by step.
- First, version.py is imported, which is where the codes for the seed types are. It basically translates standard into 01, which will be the prefix of the seed later. So the prefix is set to 01.
- Then the bwp (bits per word) variable takes the log2 value of the length of the word list, meaning how many words there are in it, in this case the English list, english.txt. There are 2048 words in the English list, and log2 of that is 11.
- Then num_bits is divided by bwp and rounded up, turned into an integer, and multiplied by bwp again. This rounds the requested size up to a whole number of words. 132 is already a multiple of 11, so here it gives back the same value, while a request for 125 bits would become 132.
- n_custom becomes 0 if we leave custom_entropy at the default 1, so that no extra entropy is added.
- Then n is computed again; it remains the same as the num_bits input if no custom entropy is added.
- So basically if you generate a default wallet with no extra entropy, the n variable becomes the main number holding the amount of entropy you defined initially through num_bits. In our case it stays the same since we don't add anything.
- my_entropy will just pick a random number between 0 and 2^n, where n is the same n as above, so it will be a large number. This is the prototype of the seed.
- Then we go into a while loop to search for a number whose seed hashes to a value starting with 01, which will serve as a checksum of the seed.
- If no custom entropy is added, we basically just keep adding 1 to the my_entropy number until the first 2 hex characters of its hashed format become 0 and 1. What happens is that the number is encoded with mnemonic_encode(i) and right after that decoded with mnemonic_decode(seed), I guess to test that the number can be encoded in words and read back. That is what the assert statement does: it raises an error if the decoded number does not match the original.
- Then it goes into the is_new_seed() function. That is the path for a seed you generate now; if you import an older seed in the old format, it goes into the old function instead. The code that I executed above goes into the new function. This is where the magic happens. The is_new_seed() function is actually located in another file.
- What happens here is interesting. First the seed gets normalized with the normalize_text() function in the mnemonic.py file. I believe this mostly matters for Chinese or other non-English wordlists, whose text gets converted into a standard form, so this function does not do much with the English wordlist.
- This is when things get interesting: it takes the HMAC-SHA512 hash of the seed phrase, basically the English text version of it in our case, and checks that the first 2 characters are 01, since we asked for a standard wallet. Electrum defines a standard wallet as a seed whose HMAC-SHA512, keyed with Seed version, starts with 01, a Segwit wallet as one whose HMAC-SHA512, keyed with Seed version, starts with 02, and so on. So basically the while loop increments the my_entropy variable by 1 until the word list it gives back has an HMAC-SHA512, keyed with Seed version, that starts with 01 in our case. After it finds that number, it exits the loop and returns the seed:
because sister decrease neither cool more car galaxy one upset high allow
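The size rounding and the encode/decode round trip from the steps above can be sketched as follows. This is a simplified illustration and not Electrum's own code: the 2048-entry WORDLIST is filled with made-up placeholder words instead of english.txt, and round_up_bits(), encode() and decode() are names chosen for this example.

```python
import math
import secrets

# Placeholder wordlist with 2048 made-up entries (the real english.txt is not used).
WORDLIST = ["w%04d" % k for k in range(2048)]


def round_up_bits(num_bits, bits_per_word):
    # Round the requested size up to a whole number of words.
    return int(math.ceil(num_bits / bits_per_word) * bits_per_word)


def encode(number, wordlist=WORDLIST):
    # Write the number in base len(wordlist), one word per digit.
    base = len(wordlist)
    words = []
    while number:
        number, digit = divmod(number, base)
        words.append(wordlist[digit])
    return " ".join(words)


def decode(phrase, wordlist=WORDLIST):
    # Reverse of encode(): rebuild the number from its digits.
    base = len(wordlist)
    number = 0
    for word in reversed(phrase.split()):
        number = number * base + wordlist.index(word)
    return number


bits_per_word = math.log2(len(WORDLIST))  # 11.0 for 2048 words
n = round_up_bits(125, bits_per_word)     # 125 is rounded up to 132
my_entropy = secrets.randbelow(2 ** n)    # random number below 2^n
phrase = encode(my_entropy)
assert decode(phrase) == my_entropy       # the round trip must give the number back
print(n, len(phrase.split()))
```

With 11 bits per word, a 132 bit number fits into at most 12 words, which matches the length of the seed shown above.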
That's it, that is basically how Electrum generates a seed. This seed's HMAC-SHA512 sum will start with 01, and you can even check it yourself. In Linux you can install a tool called GTKHash to calculate hashes: take the seed as the text to hash and enter Seed version as the HMAC key, as defined in that function. The result is a 512 bit hash that starts with 01, so this is a valid default seed compatible with Electrum.
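The same check can be done without a GUI tool, using only Python's standard library. The sketch below assumes that Seed version is the HMAC key and the seed phrase is the message, and it skips the text normalization step, which should not matter for a plain lowercase English phrase. seed_version_hex(), has_prefix() and search() are names made up for this example, and search() is only a toy version of the loop: it hashes a counter written as text instead of a real word list.

```python
import hashlib
import hmac


def seed_version_hex(phrase):
    # Assumption: "Seed version" is the HMAC key, the phrase is the message.
    key = b"Seed version"
    return hmac.new(key, phrase.encode("utf8"), hashlib.sha512).hexdigest()


def has_prefix(phrase, prefix="01"):
    # A seed counts as valid for a type when the hex digest starts with its prefix.
    return seed_version_hex(phrase).startswith(prefix)


def search(start, prefix="01", max_tries=100000):
    # Toy loop: keep adding 1 until the hash of the number has the prefix.
    for number in range(start, start + max_tries):
        if has_prefix(str(number), prefix):
            return number
    raise RuntimeError("no match found within max_tries")


seed = "because sister decrease neither cool more car galaxy one upset high allow"
print(seed_version_hex(seed)[:2])  # the two characters to compare against 01

found = search(123456)
print(found, seed_version_hex(str(found))[:8])
```

Since two hex characters can take 256 different values, a loop like this needs about 256 attempts on average before it finds a match.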
Of course HMAC-SHA512 is a one-way function with no known practical way to reverse it, and the 512 bit version is probably resistant even to quantum computers, so there is no realistic way to reverse engineer the seed from this system.
However there is one issue: if we fix the first 2 characters of the HMAC-SHA512 output, which is in hexadecimal format, that loses entropy. Each hex character stands for 4 bits, so fixing two of them costs about 8 bits.
So that is why we start with 132 bits of entropy: we lose about 8 bits to the fixed prefix, and the seed we end up with has roughly 124 bits of entropy. That is still above the 120 bits or so that is recommended as a minimum now, given how powerful computers get.
To brute force this, a supercomputer would have to go through about 2^124 combinations, which is pretty much impossible. It is often said that there is not enough energy on Earth to go through that many combinations, and some people say that you can't even count up to this number range, not to mention the hashing and other memory intensive operations.
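To get a feeling for the size of such a search space, here is a rough back-of-the-envelope sketch. The guessing rate of one trillion attempts per second is a purely hypothetical figure chosen for illustration, and years_to_try_all() is a made-up helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_try_all(bits, guesses_per_second):
    """Years needed to try every value of a number with the given bit size."""
    return 2 ** bits / (guesses_per_second * SECONDS_PER_YEAR)


# Hypothetical attacker speed: one trillion guesses per second.
RATE = 10 ** 12

for bits in (64, 120, 124, 128):
    print("%d bits: %.3g years" % (bits, years_to_try_all(bits, RATE)))
```

Every extra bit doubles the work, so the difference between 124 and 128 bits is a factor of 16, while both remain far outside anything that can be searched exhaustively at such a rate.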
It looks like Electrum is safe to use. It has passed my audit; I am no crypto expert, but from what I have researched and learned it looks safe to me.
I am still skeptical about that custom_entropy thing, and I should ask the dev what it does exactly, but other than that, default wallet generation is flawless. There are no backdoors in my opinion.
After all many thousands of people use Electrum, especially people holding large amounts there so it better damn be safe to use, and in my opinion it is.
I have analyzed its main seed generation code in this article. Of course the code is a lot more than this, but we already know that if you generate a seed with it on an offline computer, it should be safe. I haven't looked into the network related parts of it, but I trust them to be safe.
It’s a cool wallet, use it if you want: https://electrum.org
- Electrum software is the copyright of Thomas Voegtlin, licensed under the MIT license.
- Python is a trademark of the Python Software Foundation