RE: 3 Reasons Steem Price Will Go Up - And One Reason It Will Not Reach The Moon

in #steemit, 7 years ago

You have some incorrect information. Seed nodes (consensus/p2p, witness, and exchange nodes) actually run with a shared memory file size of 16GB, and that doesn't even need to be placed in RAM - it can be on SSD. Full RPC nodes without AH filtering can use a file of over 100GB, but again it can be placed on fast SSD instead; the actual RAM requirement is much, much lower. The misconception comes from the fact that placing the whole database in RAM gives the lowest latency, so people do it when they can, but as time goes on this will become less of a standard and people will simply use fast storage instead. The RAM requirements will not explode exponentially. There has been a lot of FUD surrounding this topic lately and it should be dispelled.

I have this information from private discussions with other witnesses and also from my own experience setting up seed nodes and operating a witness node. I didn't write about witness nodes, only about full RPC nodes. It is not my intention to spread FUD; on the contrary.

This is the first time I've heard from somebody very close to the Steemit dev team that this type of data can be stored on SSD. Can you please give me more information? Is this something that changed recently? Are there any new developments in the underlying data storage libraries? I'd love to get fresh information from reliable sources, ideally with some relevant tech links (I can read code, so feel free to point me to the relevant GitHub files).

 7 years ago (edited)

Most VPS providers use SAN storage that they label SSD. It is SSD, but you're going out over their local network to hit their array of SSDs, which typically means high latency. If you have a physically attached fast SSD, it's acceptable to place the entire shared memory file on it and run without much difficulty.

So what you're telling me is that the current latency of SSD is good enough for the shared memory file of a witness or RPC node? I have never used a VPS, just bare metal machines.

Yes, a sufficiently fast SSD is enough for the shared memory file on a full RPC node (not just a witness). However, it will handle fewer requests than a node run entirely out of RAM - but that can be mitigated using a caching layer (like jussi, for example).

Here's a test you can do to 'prove' it if you want. I just tried it using a 32GB Digital Ocean droplet. You can spin one up there for a few hours and then kill it. They are expensive but bill hourly, so as long as you don't forget to kill the droplet, a few hours would probably cost a few dollars or so.

Install docker from get.docker.com, then run:

docker run -d --env USE_WAY_TOO_MUCH_RAM=1 --env USE_FULL_WEB_NODE=1 --env USE_PUBLIC_SHARED_MEMORY=1 --env USE_NGINX_FRONTEND=1 -p 2001:2001 -p 8090:8090 --name steemd-full steemit/steem:0.19.1-p2pfix-bumpram

To follow along, use:

docker logs -f steemd-full

That particular branch will have a current state file that can be pulled in at the moment (this changes regularly since it pulls from the dev environment where different branches are tested, so expect it to change or become stale at any given point). NOTE: The reason that option isn't specified in the README is because we don't recommend trusting anyone else's state file for transactions. For any production setup, you should really generate your own state files. This is OK for testing, though, and for getting a full node up a little more quickly.

Anyway, that will pull in a ~24GB state file while decompressing it at the same time to shortcut getting the node started. It's pulling externally from a bucket in S3 us-east-1 WHILE decompressing it, so expect it to take a while (maybe 45 minutes or so). It will most likely take an additional hour or so for the node to 'catch up'. So in 2-3 hours you will have a fully synced full RPC node running on a 32GB RAM droplet, running just off the SSD. More RAM is better for performance because of OS disk paging, but for the example I wanted to use a minimum for a full node.
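One rough way to watch the 'catch up' phase is to compare the head block's timestamp with the current time. The sketch below assumes the node answers JSON-RPC on the mapped port 8090 and that the reply to `get_dynamic_global_properties` carries a UTC `time` field; the function names and the exact request shape in the trailing comment are illustrative assumptions, not something taken from the steemd README.

```shell
#!/usr/bin/env bash
# Sketch: estimate how far behind the node's head block is.
# Assumption: the RPC reply contains a UTC "time" field such as "2017-08-20T12:00:00".

# Pull the head block timestamp out of a JSON reply (plain sed, no jq needed).
extract_head_time() {
  printf '%s' "$1" | sed -n 's/.*"time"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the lag in seconds between "now" (epoch seconds) and the head block time.
head_block_lag_seconds() {
  local reply="$1" now="$2" head_time head_epoch
  head_time="$(extract_head_time "$reply")"
  [ -n "$head_time" ] || return 1
  # Swap the ISO "T" for a space so `date -d` parses the timestamp reliably.
  head_epoch="$(date -u -d "$(printf '%s' "$head_time" | tr 'T' ' ')" +%s)" || return 1
  echo $(( now - head_epoch ))
}

# Example against a live node (assumed request shape, not executed here):
#   reply="$(curl -s --data '{"jsonrpc":"2.0","id":1,"method":"get_dynamic_global_properties","params":[]}' http://localhost:8090)"
#   head_block_lag_seconds "$reply" "$(date -u +%s)"
```

Since Steem produces a block roughly every 3 seconds, a lag of a few seconds would suggest the node is caught up, while a lag of hours or days means it is still syncing.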

The Digital Ocean droplets still do not have physically attached drives (afaik), but they have particularly fast disk i/o for a VPS provider, which is enough for an 'example'. Results will be much better (faster syncing) on bare metal with physically attached disks. tl;dr: I would not use a minimum VPS to run a full node. But the example does prove the point :)
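To get a first impression of whether a given disk is 'fast enough', a crude check is to time small synchronous writes. This is only a sketch: it assumes GNU `dd` and GNU `date`, the function name `avg_sync_write_us` is made up for illustration, and synchronous 4 KiB writes are not the same access pattern as steemd's shared memory file, so treat the number as a rough indicator for comparing machines rather than a benchmark.

```shell
#!/usr/bin/env bash
# Crude proxy for storage latency: average time per small synchronous write.
# Assumptions: GNU dd (oflag=dsync) and GNU date (%N) are available.

# usage: avg_sync_write_us <directory> [count]
avg_sync_write_us() {
  local dir="$1" count="${2:-200}" file start end
  file="$(mktemp "$dir/latency-test.XXXXXX")" || return 1
  start="$(date +%s%N)"
  # Each 4 KiB block is flushed to the device before dd writes the next one.
  if ! dd if=/dev/zero of="$file" bs=4k count="$count" oflag=dsync 2>/dev/null; then
    rm -f "$file"
    return 1
  fi
  end="$(date +%s%N)"
  rm -f "$file"
  # Nanoseconds elapsed -> microseconds per write.
  echo $(( (end - start) / 1000 / count ))
}

# Example: avg_sync_write_us /var/lib/docker 500
```

Running it on a network-backed VPS volume and on a physically attached SSD should make the latency difference described above visible.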

Thanks a lot for taking the time to write this, I'll try it next week. As steem.supply is getting more and more traffic, I need to set up a dedicated node for it. I also have a couple of other Steemit projects in the pipeline, so my own RPC node is becoming a necessity. I may not set it up on bare metal earlier than mid-September, but I will spin up a few instances to test the behavior. Appreciate your efforts guys, I know this is not an easy project.

That's really helpful... I'll probably try that later. How long would you expect that VPS to take to complete a full replay if I asked it to? Would it be several days?

In addition, using technology like a caching database layer to drastically reduce the number of requests needed to host a full site, along with the fact that steemd will soon be multithreaded, will make it easier to scale. All of the technical challenges will be overcome.

Thanks for putting this information out. I think some of the concern comes from the perceived difficulty of running full RPC nodes without large amounts of RAM. I'd love to see the specifications of a sufficiently capable hardware configuration (one that we could expect to be sufficient for, say, 6 months). I understand, for example, that using the SSD for swap requires RAID too? I'm considering running one to support services I'm developing.

That's another way to do it - if you carve out a tmpfs large enough to hold the shared memory file and add swap on a less-than-fast SSD, you are basically letting the OS handle keeping the most relevant pieces in RAM. If you already have fast enough disk storage, it's unnecessary to do that. Another option, if your disk storage isn't fast enough by itself, is to RAID two disks together (if available) to gain the performance needed.
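As a rough sketch of the tmpfs-plus-swap layout, the script below works out how much swap would be needed for a given shared memory file size and prints the commands without running them. The helper names, the 4GB OS headroom, the use of /dev/shm as the tmpfs, and the /swapfile path are all assumptions for illustration; the printed commands need root, and pointing steemd at the tmpfs is left out.

```shell
#!/usr/bin/env bash
# Sketch: size a tmpfs plus swap for the shared memory file.
# All sizes are whole gigabytes; the default headroom figure is an assumption.

# usage: swap_needed_gb <shm_file_gb> <ram_gb> [os_headroom_gb]
swap_needed_gb() {
  local shm="$1" ram="$2" headroom="${3:-4}" need
  need=$(( shm + headroom - ram ))
  if [ "$need" -lt 0 ]; then need=0; fi
  echo "$need"
}

# Print (do not run) the commands for that layout.
print_tmpfs_plan() {
  local shm="$1" ram="$2" swap
  swap="$(swap_needed_gb "$shm" "$ram")"
  # Grow the tmpfs so the whole shared memory file fits in it.
  echo "mount -o remount,size=${shm}G /dev/shm"
  if [ "$swap" -gt 0 ]; then
    # Back the part that does not fit in RAM with a swap file on the SSD.
    echo "fallocate -l ${swap}G /swapfile"
    echo "chmod 600 /swapfile"
    echo "mkswap /swapfile"
    echo "swapon /swapfile"
  fi
}

# A 100GB shared memory file on a 32GB machine.
print_tmpfs_plan 100 32
```

With this layout the kernel decides which pages of the file stay in RAM and which get pushed to swap, which is the "letting the OS handle it" behavior described above.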
