Introducing datasteem, an open source block parser that stores posts and user data in a MySQL database.

in #utopian-io, 6 years ago (edited)


Hello !

For an upcoming project linked to steempress, we needed a local db with all the posts made on the blockchain over the last week. For speed, of course, but also to run complex queries in one go. Of course, you could argue that there are already solutions like steemdata or steemsql, but steemdata is out of sync and steemsql is not free.

So I figured that building a custom parser would be the best way to go forward.

I spent a lot of time on the code, which can easily be modified for a lot of different use cases, so I figured I'd release the source. Note that you may still find some bugs.

You can now find the source on my GitHub:

https://github.com/drov0/datasteem

What is the project about?

datasteem is a tool that fills a mysql database with info about posts and users. More specifically:

For posts: block_id, author, title, date (unix timestamp, god, so much simpler to deal with those), text, permlink, image (if any), tag1, tag2, tag3, tag4, tag5, json_metadata, reward (in SBD), comment count and upvote count.

For users: username, reputation, steem_posts, steem join date, follower count, following count, steem power, delegated steem power.
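To make the post columns above more concrete, here is a minimal sketch of how a `comment` operation could be flattened into a row. This is not datasteem's actual code: the function name `postToRow` is hypothetical, and it assumes that tags and images live in the `tags` and `image` arrays of json_metadata, as they usually do for posts made through the common frontends. Reward, comment count and upvote count are left out because they are not part of the operation itself.

```javascript
// Hypothetical helper, not taken from the datasteem source.
// op: payload of a `comment` operation; blockId / blockTimestamp: from the enclosing block.
function postToRow(op, blockId, blockTimestamp) {
  let meta = {};
  try {
    meta = JSON.parse(op.json_metadata || "{}") || {};
  } catch (e) {
    meta = {}; // tolerate malformed metadata instead of crashing the parser
  }
  const tags = Array.isArray(meta.tags) ? meta.tags.slice(0, 5) : [];
  const images = Array.isArray(meta.image) ? meta.image : [];
  return {
    block_id: blockId,
    author: op.author,
    title: op.title,
    // Block timestamps are UTC strings without a zone suffix, so "Z" is appended
    // before converting to a unix timestamp in seconds.
    date: Math.floor(Date.parse(blockTimestamp + "Z") / 1000),
    text: op.body,
    permlink: op.permlink,
    image: images.length > 0 ? images[0] : null,
    tag1: tags[0] || null,
    tag2: tags[1] || null,
    tag3: tags[2] || null,
    tag4: tags[3] || null,
    tag5: tags[4] || null,
    json_metadata: op.json_metadata || "",
  };
}
```

A row built this way can then be handed to whatever insert statement matches the real schema in db.sql.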

It can fill the database in real time and, in the case of a crash, catch up if it missed a few blocks.
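The catch-up idea can be sketched roughly like this, assuming the parser keeps track of the last block number it fully processed (for example in the database) and can ask a node for the current head block number. The function name `catchUpRange` is hypothetical and only illustrates the bookkeeping, not the actual implementation.

```javascript
// Hypothetical helper, not taken from the datasteem source.
// lastProcessed: highest block number already stored before the crash.
// headBlock: current head block number reported by the node.
// Returns the inclusive range of blocks to replay, or null if nothing was missed.
function catchUpRange(lastProcessed, headBlock) {
  if (headBlock <= lastProcessed) {
    return null;
  }
  return { from: lastProcessed + 1, to: headBlock };
}
```

For example, if the last stored block was 100 and the head is now at 103, blocks 101 to 103 need to be replayed before going back to real-time streaming.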

Technology stack

It's a typical node program with a mysql database behind it.

I stream the blocks via dsteem and do most of the queries via steemjs. This is because I'm way more familiar with steemjs for querying data and I find it more convenient. And dsteem's stream api is just great, hence the crossover.
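As a rough illustration of the streaming side, the sketch below pulls the top-level posts out of a single block. It relies on the usual Steem block layout, where each transaction carries a list of `[name, payload]` operations and a top-level post is a `comment` operation with an empty `parent_author`. The function name `rootPostsInBlock` is hypothetical, and the commented-out dsteem wiring at the bottom is only how I would expect it to be hooked up, not a copy of the project's code.

```javascript
// Hypothetical helper, not taken from the datasteem source.
// Returns the payloads of all top-level `comment` operations in a block.
// Note: edits of an existing post look the same as new posts at this level.
function rootPostsInBlock(block) {
  const posts = [];
  for (const tx of block.transactions || []) {
    for (const [name, payload] of tx.operations || []) {
      if (name === "comment" && payload.parent_author === "") {
        posts.push(payload);
      }
    }
  }
  return posts;
}

// Possible wiring with dsteem (needs the dsteem package and network access):
// const dsteem = require("dsteem");
// const client = new dsteem.Client("https://api.steemit.com");
// client.blockchain.getBlockStream().on("data", (block) => {
//   for (const post of rootPostsInBlock(block)) {
//     console.log(post.author, post.permlink);
//   }
// });
```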

I strongly recommend the use of pm2 to handle the process.

How do I run it ?

git clone git@github.com:drov0/datasteem.git
cd datasteem
npm i

Install the database via the db.sql file.

pm2 start datasteem.js

Roadmap

I plan on polishing the tool and the db schema so they scale and handle more data/operations. For instance, I would love to store all voting operations to perform some data analysis on how the global stake is moving.

How to contribute?

Submit a pull request. Comment your code, or write it so that it reads by itself, and you're good to go :D



Posted on Utopian.io - Rewarding Open Source Contributors


Thank you for the contribution. It has been approved.

  • What do you think of https://github.com/steemit/sbds ?
  • What types of complex query are you doing on this dataset?
  • How big is your database at this point?

Need help? Write a ticket on https://support.utopian.io.
Chat with us on Discord.

[utopian-moderator]

Thanks !

  • I didn't know about sbds. It looks very similar to what I'm doing, so I'll have a deeper look at it to see how I can contribute.

  • Well, for instance: "get all the posts made by users between 30 and 50 rep, that are at least 1 hour and 6 minutes old and at most 6 days old, with a reward between $1 and $20, containing either the tag science, technology or health but which must also have the tag steemstem; show the first 450 chars as a preview, and the author must not be one of those 5000 blacklisted users".

  • For 6 days' worth of data I'm hovering around 1-1.5 GB, depending on the size of the text of the posts. It can take quite a while to actually query all that, so I store it using the MEMORY db engine so everything stays in RAM, where I can go over the data more quickly.
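To give an idea of what the example query above could look like, here is a rough sketch that builds it as a parameterized statement. Everything about the schema is an assumption made for illustration: the table names `posts`, `users` and `blacklist` are hypothetical, and the column names simply follow the field list from the post, so the real db.sql may differ. The `?` placeholders are the style used by the mysql package for Node.

```javascript
// Hypothetical query builder; table names (posts, users, blacklist) are assumptions.
// filters: { previewLength, minRep, maxRep, minAgeSeconds, maxAgeSeconds,
//            minReward, maxReward, requiredTag, anyTags }
// nowSeconds: current unix timestamp in seconds.
function buildPostQuery(filters, nowSeconds) {
  const tagColumns = ["tag1", "tag2", "tag3", "tag4", "tag5"].map((c) => "p." + c);
  const anyTagMarks = filters.anyTags.map(() => "?").join(", ");
  // At least one of the five tag columns must hold one of the optional tags.
  const anyTagClause = tagColumns
    .map((c) => c + " IN (" + anyTagMarks + ")")
    .join(" OR ");
  const sql = [
    "SELECT p.author, p.permlink, p.title, LEFT(p.text, ?) AS preview",
    "FROM posts p JOIN users u ON u.username = p.author",
    "WHERE u.reputation BETWEEN ? AND ?",
    "AND p.date BETWEEN ? AND ?",
    "AND p.reward BETWEEN ? AND ?",
    // The mandatory tag may sit in any of the five tag columns.
    "AND ? IN (" + tagColumns.join(", ") + ")",
    "AND (" + anyTagClause + ")",
    "AND p.author NOT IN (SELECT username FROM blacklist)",
  ].join(" ");
  const params = [
    filters.previewLength,
    filters.minRep,
    filters.maxRep,
    nowSeconds - filters.maxAgeSeconds, // oldest allowed post date
    nowSeconds - filters.minAgeSeconds, // newest allowed post date
    filters.minReward,
    filters.maxReward,
    filters.requiredTag,
  ];
  // The optional tags are repeated once per tag column, matching anyTagClause.
  for (let i = 0; i < tagColumns.length; i++) {
    params.push(...filters.anyTags);
  }
  return { sql, params };
}

// The example from the reply above, expressed as filters.
const example = buildPostQuery(
  {
    previewLength: 450,
    minRep: 30,
    maxRep: 50,
    minAgeSeconds: 66 * 60,
    maxAgeSeconds: 6 * 24 * 60 * 60,
    minReward: 1,
    maxReward: 20,
    requiredTag: "steemstem",
    anyTags: ["science", "technology", "health"],
  },
  Math.floor(Date.now() / 1000)
);
```

The resulting `example.sql` and `example.params` can be passed together to the database client's query function.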

An impressive project. I look forward to following the progress.

Thanks !

Hmm...!
What are the advantages and benefits of datasteem? I don't understand.

The advantage for me is that it's homemade, so it's tailored to my needs. Also, it's built on mysql, which is a db that none of the open source projects of this kind that I have come across so far support.

Hey @howo! Thank you for the great work you've done!

We're already looking forward to your next contribution!

Fully Decentralized Rewards

We hope you will take the time to share your expertise and knowledge by rating contributions made by others on Utopian.io to help us reward the best contributions together.

Utopian Witness!

Vote for Utopian Witness! We are made of developers, system administrators, entrepreneurs, artists, content creators, thinkers. We embrace every nationality, mindset and belief.

Want to chat? Join us on Discord https://discord.me/utopian-io

That sounds really cool! I would be really interested if this could pull all the newest posts from a tag and show a feed in a widget. Do you think it could be used for that?

Hey thanks for your post.
This is very helpful and informative post.
You also like to see my post.

flagged for spam
