Some Tips for Witness Operators for Selecting Hardware and Speeding Up Replay, But First: virpus.com - Another Interesting-looking VPS Rental Service
I have bumped into a minor problem with vpsserver: my witness is running, but I am unable to connect to it - even pings go nowhere. For some reason the fraud flag was raised and my trial account is suspended, though according to https://steemd.com/witnesses my witness seems to still be running and connected to the network (it does not need inbound connections).
I will probably have to generate a new key for my witness, which will invalidate anything the one stuck in the fraud-remand-center does until they shut it down - and of course I want to be in control of my witness. It has been a useful test, however; I have learned quite a lot about the behaviour of steemd, the Steem blockchain application, and about SSDs and /dev/shm.
But first, in my continuing search for suitable hosting services, I came across Virpus, which has the best prices I have seen.
Here is the table of the ones Steem Witness operators would be most interested in:
cores | RAM | SSD | bandwidth | price |
---|---|---|---|---|
2 | 512 MB | 15 GB | 1.5 TB | $5/month |
4 | 1 GB | 30 GB | 3 TB | $7/month |
6 | 2 GB | 60 GB | 4 TB | $15/month |
8 | 4 GB | 90 GB | 5 TB | $30/month |
10 | 8 GB | 180 GB | 6 TB | $60/month |
12 | 16 GB | 360 GB | 7 TB | $120/month |
16 | 32 GB | 720 GB | 8 TB | $240/month |
I have not tested them yet; however, I will be getting the 2 GB memory one next week with part of the powerdown I have scheduled.
Some information about steemd 0.16.3 memory usage
On my main, big VPS with 16 GB of memory, I have been experimenting with settings, using a program called iotop to watch the amount of data being read and written, and seeing how using /dev/shm affects replay performance.
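If you want to watch this yourself, iotop is in the standard Ubuntu repositories; the flags below limit the display to processes actually doing I/O and show accumulated totals since it started:
# install iotop (Ubuntu/Debian), then show only processes doing I/O (-o)
# with accumulated read/write totals (-a)
sudo apt-get install iotop
sudo iotop -o -a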
First: replay for the witness-only function
steemd uses about 2.2 GB of memory when replaying set up this way. When it is running normally, it only uses about 1.6 GB, so it can run on a 2 GB system once the replay is complete. To do this, run the replay with the same build and settings on a machine with lots of memory, then stop it and tar up the data directory, including the shared data, like this:
tar cJvf witness_data_dir.tar.xz witness_data_dir
This will pack it all up. Then, from your intended witness server machine, you can pull it over with scp:
scp replayserver.url:/path/to/witness_data_dir.tar.xz .
(don't forget the '.' at the end, that means copy to the current working directory.)
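If you want to be sure the tarball survived the transfer intact, a quick checksum comparison on both ends will tell you (sha256sum ships with Ubuntu):
# run on the replay server, then again on the witness server after the scp;
# the two hashes should be identical
sha256sum witness_data_dir.tar.xz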
I am assuming that you are running a standard Ubuntu 16.04 configuration with no password for root and a user with sudo privileges, and that you run the witness out of this user's home directory (/home/username).
Next you unpack it thusly:
tar xvf witness_data_dir.tar.xz
and then it should fire up after a few minutes with the following command:
steemd -d witness_data_dir
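steemd runs in the foreground, so you will probably want to start it inside something like screen or tmux so it keeps running after you close your SSH session. A minimal sketch using screen (install it first if your image doesn't include it):
# start a named screen session and run steemd inside it;
# detach with Ctrl-a d and reattach later with 'screen -r witness'
screen -S witness
steemd -d witness_data_dir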
Speeding up replay
For this, you need a lot of memory: 16 GB is really the minimum, though with an SSD you can maybe get away with something like 4 GB plus a very large swap file. /dev/shm (shared memory) is a type of ramdisk that gets paged out to swap when memory is needed for active reads and writes. You can modify witness_data_dir/config.ini to change the size of the shared memory file, and I have observed how much of this file is actually used in the different ways of configuring it (for RPC, witness-only, or minimalistic for seeding).
The witness plugin currently requires about 8-10 GB of the shared file, but keep in mind that this amount will steadily rise in proportion with the blockchain size, which is currently around 7.1 GB. Suffice it to say that 16 GB is probably a safe size that won't see you needing another replay to rebuild the shared file for at least 5 months or so. I have not been watching it long enough to know the rate of increase, but I think the thing that determines the growth of the blockchain most is the increase in the number of accounts, and in the number of those that are actively voting, posting, transferring and moving money around.
The indexes that link together accounts and transactions are probably one of the biggest parts of the database. There is probably a way to inspect the number of tables and the data they consume using the debug mode, but I am not that far yet (and may never play with it, because I hate C++ and boost); it is enough that steemd tells you at the end of replay how much of the shared file's total size is still free.
The RPC function, on the other hand, requires a lot more: about 26 GB of shared file is what it uses when it has finished replaying. The replay itself uses about 3.2 GB of process memory; once it has finished, that drops back to about 2.4 GB. I can't be absolutely certain, but I think a 4 GB machine is enough to run it. For the replay, however, disk speed is a huge bottleneck, and as the code currently works, in combination with Linux's disk caching policies, it is better - especially for RPC replays - to do it on a machine with a lot of memory and the fastest possible disk. Otherwise the replay will probably go on for a week or more.
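If you want to pin the shared file size in witness_data_dir/config.ini, the option is, as far as I can tell from the builds I have used, shared-file-size (check your version's command line help before relying on the name). A rough sketch using the figures above:
# witness-only node: roughly 8-10 GB in use today, so leave some headroom
shared-file-size = 16G
# a full RPC node uses about 26 GB today, so something like 40G is safer
# shared-file-size = 40G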
How to use /dev/shm to speed up replay
First, you need a machine with a lot of memory. It could be your own home PC, but you will then need to upload about 20-40 GB afterwards, depending on how the steemd was configured. I personally think that 8 GB of memory and an SSD, together with the trick I will show you - it keeps steemd off the hard drive for a long time and makes its disk utilisation a lot more efficient (leading to a faster replay) - would finish a replay in under 8 hours, but I have not yet tested this. I can only say positively that it works on a system with 16 GB and an HDD. The SSD speeds up especially the last part of the process, once the data in the shared file exceeds physical memory; using /dev/shm speeds up the start of the process, before that point.
Let's say you have a system like my biggest VPS, with 16 GB of memory. First, you have to enable enough swap to cover the utilisation of the shared data file, so for a 16 GB machine that means another 16 GB of swap; for an 8 GB SSD machine you would need to make the swap at least 24 GB.
(I will not go into how to create swap files or configuring swap at system install; on many VPSes you will have to use swap files, and that also means you can't use BTRFS for the volume holding the swap, because, well, BTRFS reasons.)
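For reference, here is a minimal sketch of creating a swap file on an ext4 volume (the 24G size and the /swapfile path are just examples; adjust them to your machine):
# create a 24 GB swap file, lock down its permissions, format and enable it
sudo fallocate -l 24G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# check that it is active
swapon --show
With the swap in place, enlarge /dev/shm so it is big enough to hold the whole shared file: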
sudo mount -o remount,size=16G /dev/shm
or
sudo mount -o remount,size=32G /dev/shm
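You can confirm the new size took effect before starting the replay:
df -h /dev/shm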
Then you can start up the server like this:
steemd -d witness_data_dir --shared-file-dir /dev/shm
The witness will then use this swap-backed ramdisk instead of the hard drive to store the shared file. It stays entirely in RAM until its utilisation exceeds physical memory, at which point the least recently accessed parts get swapped out (and with an HDD the process will start to have to wait up to half the time or more for this to complete - this is why it goes so slowly on HDD machines with low memory, and why IMHO the replay code should be better optimised to write to disk far less frequently).
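While the replay runs, you can keep an eye on when the shared file starts spilling into swap by refreshing the memory figures every few seconds:
# show RAM and swap usage in human-readable units, refreshed every 5 seconds
watch -n 5 free -h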
After it finishes the replay, you can stop the server (it will note in its output that it is attempting to shut down cleanly), and then copy the data from the shared folder back to the normal default location, the blockchain directory inside witness_data_dir:
cp /dev/shm/* /path/to/witness_data_dir/blockchain/
If your system is anything like my 16 GB VPS, this will take at least an hour - mine pulls the file back off the swap at a rate of about 5 MB/s! But it is worth it, because having this copy lets the server be restarted without the shared data file sitting on /dev/shm, which means it could be run on a 2 GB (witness/seed) or 4 GB (RPC node) SSD VPS.
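If you would rather see how the copy is progressing while you wait, rsync does the same job as cp with a progress readout (rsync is on a stock Ubuntu install):
# copy the contents of /dev/shm into the blockchain directory, showing progress
rsync -ah --progress /dev/shm/ /path/to/witness_data_dir/blockchain/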
You just start it up as normal:
steemd -d witness_data_dir
and after a few minutes it will have synced up to date and will start catching new blocks as they come in, ready for whatever its normal function is (witness, RPC queries or seeding).
In future I will write a guide on how to set up a minimalistic steemd for running as a seed node. Basically, at least one plugin must be enabled, so I pick raw_block (you might prefer the one that gives you the decoded format you would use for a block-explorer-type app, but raw_block does the least indexing and uses the least extra space). You set it to listen for API calls only on localhost:8090 or so, since you don't intend this function to actually do any work (unless you do, of course), and it will also listen for peers on the p2p port you set - the standard port is 2001 for seed nodes.
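As a rough idea of what that looks like in config.ini - the option names here (p2p-endpoint, rpc-endpoint, enable-plugin) are what I believe the 0.16.x builds use, so double-check them against your build's --help before copying anything:
# p2p port other nodes connect to (standard seed port)
p2p-endpoint = 0.0.0.0:2001
# API only on localhost, since the node is not meant to serve queries
rpc-endpoint = 127.0.0.1:8090
# keep indexing to a minimum
enable-plugin = raw_block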
We can't code here! This is Whale country!
Vote #1 l0k1
Go to steemit.com/~witnesses to cast your vote by typing l0k1 into the text entry at the bottom of the leaderboard.
(Note: my username is spelled El Zero Kay One, or Lima Zero Kilo One, all lower case.)