Tip: Ideal hardware for faster replays

in #steem, 6 years ago (edited)

Witnesses are suffering from a severe case of replay fever. Here are some anecdotal observations on speeding up your replays. Please note that these are hardly controlled benchmarks - rather just my flawed personal observations from tinkering with witness nodes for the last 2+ years. It'd be great if fellow witnesses and developers chimed in with their experiences.

The primary bottleneck for replays is single-thread CPU performance. You can boast about your 32-core machines, but Steemd will only max out 1 thread, and use spare change on a 2nd thread. Worse still, many-core CPUs come with far lower single-thread clocks (often half). So, a 4+ GHz dual-core CPU will end up running circles around a 2.3 GHz 32-core CPU.

Currently, the fastest CPU on the market for single-thread performance is the Intel Core i7 8700K. It can typically be overclocked to the ~5.0 GHz range. Of course, it's likely you are renting servers from a cloud provider, so overclocking is out of the picture. Quite likely they also sell Xeon instead of Core. The fastest server CPU is basically the 8700K's Xeon counterpart - the Xeon E-2186G or E-2176G. Going back further, you should look for a CPU from the Skylake generation, at least, and focus on the 1C Turbo Boost clock. Note that Skylake, Kaby Lake and Coffee Lake have identical IPC, so it's all about hitting that peak 1C Turbo Boost clock. Intel's ARK database will give you both pieces of information: the generation and the Turbo Boost clock.
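If you already have shell access to a machine, a quick way to see the clock ceiling it advertises is to read the cpufreq entries in sysfs. The following is a minimal sketch, assuming a Linux host whose cpufreq driver exposes `cpuinfo_max_freq` (reported in kHz); the function name `max_cpu_ghz` is made up for illustration. On many drivers this figure includes Turbo Boost, but that is not guaranteed, so treat it as a sanity check against ARK rather than a replacement for it.

```shell
#!/bin/sh
# Print the highest advertised clock (in GHz) across all logical CPUs.
# The sysfs CPU directory can be passed as $1 (handy for testing);
# the default is the usual Linux location. Prints nothing if the
# cpufreq files are missing, as on some virtual machines.
max_cpu_ghz() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    cat "$cpu_root"/cpu[0-9]*/cpufreq/cpuinfo_max_freq 2>/dev/null |
        sort -n | tail -n 1 |
        awk '{ printf "%.2f\n", $1 / 1000000 }'
}

ghz="$(max_cpu_ghz)"
echo "Peak advertised clock: ${ghz:-unknown} GHz"
```

To watch the single-thread bottleneck itself during a replay, `top -H -p "$(pidof steemd)"` lists CPU usage per thread, assuming a single process named `steemd` is running on your setup.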

AMD's Ryzen 7 2700X is a good choice too, but Intel wins on peak single core performance.

The next most important requirement is a fast I/O subsystem. For best replay performance, you should store the shared memory file in RAM - but it goes beyond that. Firstly, make sure every memory channel on the CPU is populated for maximum memory bandwidth. E.g. if you have a quad-channel Skylake-X CPU, populating it with only 2 sticks will cut your effective memory bandwidth in half.
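The halving follows from simple arithmetic: each DDR4 channel moves 8 bytes per transfer, so theoretical peak bandwidth is roughly the transfer rate times 8 bytes times the number of populated channels. Here is a rough sketch of that calculation. DDR4-2666 is only an illustrative speed, not a figure from my own nodes, and the function name is hypothetical.

```shell
#!/bin/sh
# Theoretical peak DRAM bandwidth in GB/s.
#   $1 = transfer rate in MT/s (e.g. 2666 for DDR4-2666)
#   $2 = number of populated memory channels
# Each DDR4 channel is 64 bits wide, i.e. 8 bytes per transfer.
peak_bandwidth_gbs() {
    awk -v mts="$1" -v channels="$2" \
        'BEGIN { printf "%.1f\n", mts * 8 * channels / 1000 }'
}

echo "4 channels: $(peak_bandwidth_gbs 2666 4) GB/s"
echo "2 channels: $(peak_bandwidth_gbs 2666 2) GB/s"
```

These are ceilings rather than measured numbers, but the ratio is the point. On your own hardware, `sudo dmidecode -t memory` should show which slots are actually populated.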

The speed and latency of your RAM can make a significant difference, and latency matters most. There's a significant boost from RAM with tight timings - I'd recommend 14-14-14-14-32... Replay speed seems particularly sensitive to the first number, the CAS latency. Memory speed is also important, but note that higher speeds usually come with looser timings. So, instead of rushing out for DDR4-4000, you might want to consider DDR4-3200 with tighter latencies.
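To compare kits on equal terms, it helps to convert the CAS latency from clock cycles into nanoseconds. DDR memory does two transfers per clock, so one cycle lasts 2000 / (MT/s) nanoseconds, and the absolute CAS latency is CL times that. A small sketch follows; the CL19 figure for DDR4-4000 is an assumed, typical-looking value for illustration, not a specific product, and the function name is made up.

```shell
#!/bin/sh
# Absolute CAS latency in nanoseconds.
#   $1 = transfer rate in MT/s (e.g. 3200 for DDR4-3200)
#   $2 = CAS latency in clock cycles (the first timing number)
# One clock cycle lasts 2000 / MT/s nanoseconds (two transfers per clock).
cas_latency_ns() {
    awk -v mts="$1" -v cl="$2" \
        'BEGIN { printf "%.2f\n", cl * 2000 / mts }'
}

echo "DDR4-3200 CL14: $(cas_latency_ns 3200 14) ns"
echo "DDR4-4000 CL19: $(cas_latency_ns 4000 19) ns"
```

Under these assumptions the slower-clocked kit answers sooner in absolute terms, which is the trade-off described above.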

If you don't want to use RAM, NVMe (particularly in RAID) is pretty fast too, if you combine it with a fast single-threaded CPU. I haven't tried Optane, but it might offer performance comparable to RAM.
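Whichever location you pick, it is worth checking that it has enough free space before starting a replay. Below is a minimal sketch, assuming a Linux host where `/dev/shm` is a RAM-backed tmpfs mount; the `has_room` helper and the 64 GB figure are hypothetical, so substitute the size you actually configure for the shared memory file.

```shell
#!/bin/sh
# Succeed if directory $1 has at least $2 GB available.
# Uses POSIX df output: line 2, column 4 is available space in KB.
has_room() {
    dir="$1"
    need_gb="$2"
    avail_kb="$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')"
    avail_kb="${avail_kb:-0}"   # treat a failed df as "no space"
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example check; 64 is a placeholder, not a recommendation.
if has_room /dev/shm 64; then
    echo "/dev/shm has room for a 64 GB shared memory file"
else
    echo "/dev/shm is too small (or missing) for 64 GB"
fi
```

The same function works for an NVMe mount point, since it only looks at free space and not at what backs the directory.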

Of course, this information is hard to glean from cloud servers. You might want to get in touch with the service provider and ask them for exact specifications.

Since Steemd is heavily bottlenecked by a few specific parts of the server, alleviating those bottlenecks can lead to massive increases in performance. Hope this was helpful.


Nice tip. I think on a long enough timeline, dev work would need to come up with something that doesn't require full replays at some point.

Not hardware, but there's a little tweak suggested on Steem's GitHub to speed up replays. There's a 5-10% boost here:
echo 75 | sudo tee /proc/sys/vm/dirty_background_ratio
echo 1000 | sudo tee /proc/sys/vm/dirty_expire_centisecs
echo 80 | sudo tee /proc/sys/vm/dirty_ratio
echo 30000 | sudo tee /proc/sys/vm/dirty_writeback_centisecs
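Before applying the tweak, it may be worth recording the values currently in effect, since writes to `/proc/sys` only change the running kernel and are lost on reboot. A small sketch, assuming a Linux host; the function name is hypothetical. It prints the four settings in `sysctl.conf` style (`key = value`) so the output can be kept as a reference for restoring the defaults.

```shell
#!/bin/sh
# Print the current values of the four dirty-page settings in
# sysctl.conf syntax. The directory can be passed as $1 (useful for
# testing); the default is the live kernel interface.
show_dirty_settings() {
    vm_dir="${1:-/proc/sys/vm}"
    for name in dirty_background_ratio dirty_expire_centisecs \
                dirty_ratio dirty_writeback_centisecs; do
        if [ -r "$vm_dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$vm_dir/$name")"
        else
            # Emit a comment so the output stays valid sysctl.conf text.
            printf '# vm.%s is not readable here\n' "$name"
        fi
    done
}

show_dirty_settings
```

Redirecting the output to a file before running the `echo ... | sudo tee` lines gives you something to compare against if a replay turns out slower rather than faster.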

Won't full replays become more impractical as the blockchain grows anyway?

Yes, exactly, there should be partial replays. They have made some progress with modularisation on it, at least now you don't need a replay if an API you don't use was affected. E.g. 20.4 wouldn't need a replay from 20.3 if you didn't use the witness and RC plugins/API. I think before making partial replays possible, they need to get the other plugins on RocksDB too, starting with witness.

I remember that tweak, it originally had a typo that actually made things slower... Not sure if that was corrected on the GitHub page.

The single-thread bottleneck needs a solution too, not just the need for replays. Since the blockchain will only grow, even partial replays will be a pain, and that pain would be better spread over more threads. We do want the platform to grow, and blockchain growth will follow.
