Eleven steem-data analysts to follow if you want a deeper* understanding of steemit

in steemit •  10 months ago
  • How many active users are there on steemit and where are they located?
  • How are the rewards distributed?
  • How many comments are there per post?
  • What are the highest paid posts this week?
  • What’s the relationship between reputation and reward?

If you want to know the answers to these, and lots more basic questions, then you need to follow the following eleven people:


These are the people here on steemit who do some truly excellent work extracting data from the steem blockchain and publish regular statistical posts summarising and analysing (different aspects of) the ‘state of the platform’.

The following two tags are useful to keep abreast of data-analyst posts

If you’re interested in tools which show you your own steem-data in easy-to-read format, check out this post - six useful tools for tracking your progress on steemit. If you’re interested in a wider range of tools which allow you to access and analyse aspects of the steem blockchain - go to steemtools.com.

The purpose of this post is simply to provide a list of people who do data analysis on steemit, rather than a list of good data analysis posts. The latter would be pointless anyway, because this post will be locked in the blockchain and dead seven days from the time of writing, whereas links to the people who produce the analysis will at least keep a ‘live functionality’. I've mainly written it for myself, to give me a better handle on who does what with data on steemit, but I figure a few other people out there might find it useful too....

Data analysts on steemit, and a rough outline of what they ‘specialise’ in:

I’ve selected @arcange and @penguinpablo to include at the top of the list because these are the two guys who produce daily stats updates, and (as far as I can tell) have also developed their own applications to display steem-data; and @paulag simply because she produces far more unique data analysis posts than everyone else; finally I’ve included all the other data-analysts in alphabetical order...



Everyone defers to @arcange, so he has to top the list: after all, he’s the guy who developed SteemSQL*, which is what most of the people below use to extract data from the blockchain.

@arcange produces the daily and weekly Steem Statistics Reports in multiple languages: just click on his name above for the latest posts. These are the posts which will enable you to answer some of the most basic questions about steemit, such as ‘how many active users are there?’

Specifically @arcange’s steem stats posts cover the following 11 trends:

  • New users (daily for the past 30 days and split into active and non-active)
  • Active users - long term trends over the past 9 months
  • Number of posts and comments over the past 9 months
  • Number of Upvotes (last 9 months)
  • Total number of transactions (last 9 months)
  • Most used tags over the whole steem blockchain lifetime
  • Reputation distribution
  • Number of active users according to voting power (i.e. whales, minnows, dolphins etc.)
  • Cumulative voting power by user category
  • Maximum post payout for the last nine months
  • Average post payout for the last nine months

@arcange also posts the following regular posts which will probably be of more ‘popular’ appeal:

The daily hit parade - which lists the top ten posts by number of upvotes, comments and payout (they tend to overlap!)

The Daily hit parade for newcomers - the same as above, but for users with a reputation below 50.

In addition to this he has produced some useful tutorials on how to use steemsql, for example: How to create a steem analytic report with excel. Links to more info and updates are available on the separate steemsql website.

*NB - as of January 2018, if you want to access steemSQL you need to pay a subscription to @steemsql on steemit. @arcange kept the service free for about a year, but servers cost money, and so it became a subscription service from January 2018 - the rationale, and the details of how to subscribe, are outlined in this post

P.S. if you can’t afford the full subscription but want to make a donation to support steemsql, you can do that too!



I’ve included @penguinpablo second on this list because, in addition to producing some useful daily and weekly steem stats reports, which are a nice supplement to @arcange’s, he’s also developed two very useful tools which you can use to explore your own (or anyone else’s) data:

Steemnow allows you to find out what your current 100% upvote is worth, along with the specific rewards you received for various posts and comments you made.

Steemblockexplorer is @penguinpablo’s most recent (at time of writing) tool, which allows you to explore various aspects of what’s occurring on steemit, including your own activity: the basics are explained in this post: announcing steemblockexplorer.com.

@penguinpablo produces daily and weekly steem stats reports

Focusing on the daily reports these cover:

  • Daily number of posts including comments
  • Daily number of votes
  • Daily active users
  • Posts relative to comments per post
  • Number of new accounts created
  • Daily amount of SBD converted to steem
  • Daily amounts of steem powered up or down (including the largest)
  • Daily steem transfers to and from exchanges (including the largest)
  • Steem Price updates

Finally, he also posts daily cryptocurrency price updates, focussing on changes in the last 24 hours. At time of writing, steem really stands out as gaining while other currencies stagnate.



@Paulag produces more unique data analysis posts than anyone else - there are really too many to list usefully, but here are a few of my recent favourites to give you a flavour:

Monthly - new user analysis report - which focuses on monthly users and posting, and she tags in steem cleaners to deal with spammy accounts, which is a nice touch.

Monthly - an analysis of upvotes from the minnow support account - I’m sure we’ve all benefited from this most excellent initiative in our time!

Scraping and Analyzing the Steemit Trending Page – Blockchain Business Intelligence - this is a very interesting analysis of which posts trend, and what rep the users have (mainly 67+). What I especially like about this post is the questions to steeminc at the end lamenting the data we don’t have access to, reminding us that we are limited to the data we have!

Just recently @paulag has been focussing on producing a ‘round up of 2017’ - analysing the performance of different tags on steemit, for example.

@Paulag also runs the YouTube steemit ambassador competition - analysing YouTube data and rewarding those who do the most to promote steemit outside of the platform.

Finally, she does a lot to promote steemit offline, and has recently launched a ‘cracking steemit in 28 days course’ on Udemy, which looks comprehensive, to say the least!



@abh12345 is extremely inter-active on steemit, and one of the main people promoting community development and curation - he produces three regular weekly data analysis posts (links below to latest posts at time of writing):

Firstly: the curation leagues - a two league curation competition with prizes, which is one of the more unique competitions here on steemit, with some great peops who take part.

  • League one is decided simply by how many SP you get per 1000 SP for voting over a fortnight.
  • League two is decided by your interactivity - how many people you vote for, how many comments you make and so on (with a negative score for self-voting).
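
As a rough illustration of the League one metric, the sketch below normalises curation rewards to "SP earned per 1000 SP held". The function name, the account names and the figures are all hypothetical, and the real league calculation may well differ (for instance in how delegations are treated).

```python
def curation_sp_per_1000(curation_reward_sp: float, effective_sp: float) -> float:
    """Curation rewards earned per 1000 SP of voting stake (hypothetical helper)."""
    if effective_sp <= 0:
        raise ValueError("effective_sp must be positive")
    return curation_reward_sp * 1000 / effective_sp


# Made-up accounts: (curation rewards earned in SP, SP held) over the same period
accounts = {
    "alice": (2.5, 500.0),
    "bob": (12.0, 4000.0),
}

# Rank accounts from the highest to the lowest normalised return
league = sorted(
    ((name, curation_sp_per_1000(reward, sp)) for name, (reward, sp) in accounts.items()),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in league:
    print(f"{name}: {score:.2f} SP per 1000 SP")
```

Normalising by stake is what lets a small account compete with a large one: in this made-up data, "bob" earns more SP in absolute terms but "alice" ranks higher per 1000 SP.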

So the analysis here is limited to the few dozen people who currently take part, but it’s still interesting to see who tops the table every week (results are weekly, but based on the last two weeks worth of data)

Just drop a comment onto the latest curation league post - although unless you’re glued to your computer for a fortnight, you’ve got 0 chance of winning, so if you’re one of the unfortunates who has to work for a living, maybe save joining in until you’re on holiday.

Secondly - @fulltimegeek's #stewardsofgondor weekly stats with self-vote percentage data

This is a new thread as of late Jan 2018, so I assume it’s going to be regular. @fulltimegeek delegated a shedload of steempower to around 50 people a while back, and these posts will be an ongoing analysis of their interactions. It kind of goes with the ‘curation leagues’, as these stewards are (supposed to be) the uber-curators on steemit. NB the reason you probably won’t ever top the curation leagues is that some of these people take part (then again, it’s the taking part that counts!)

Thirdly - the utopian-io - contribution 'approved/rejected' analysis - a regular, weekly analysis of the thousands of posts submitted to utopian-io and whether they were approved or rejected (roughly 70-30), broken down by category. Technically

@abh12345 also produced this analysis of @anomadsoul’s interactions on steemit - an interesting read, which outlines the number of votes, self-vote percentage and average rep/ voting weight voted for, so maybe look out for more of this sort of thing too.
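
For anyone wondering what a 'self-vote percentage' boils down to, here is a minimal sketch. It assumes the votes have already been extracted as (voter, author) pairs; that layout, the function name and the sample data are my own assumptions, not a description of how @abh12345 actually computes it.

```python
def self_vote_percentage(votes: list[tuple[str, str]], account: str) -> float:
    """Percentage of `account`'s votes that went to its own posts or comments.

    `votes` is a list of (voter, author) pairs - an assumed, simplified layout.
    """
    cast = [author for voter, author in votes if voter == account]
    if not cast:
        return 0.0  # the account has not voted at all
    self_votes = sum(1 for author in cast if author == account)
    return 100 * self_votes / len(cast)


# Made-up vote records: (voter, author)
sample_votes = [
    ("alice", "bob"),
    ("alice", "alice"),
    ("alice", "carol"),
    ("alice", "bob"),
    ("bob", "alice"),
]

alice_pct = self_vote_percentage(sample_votes, "alice")
print(f"alice self-vote percentage: {alice_pct:.1f}%")
```

This counts votes only; a weighted version would also need each vote's weight, which is left out here to keep the idea clear.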



@crokkon’s responsible for one of my favourite ever data analysis posts on steemit - curation rewards you could have had - which analyses the SP rewards gained for different accounts based on the voting time. It’s common knowledge that there’s a ‘max-reward’ sweet spot for voting around the 20-30 minute point, but this post shows that the exact time varies from account to account.

Overall @crokkon posts 4-8 times a month (so he won’t clog up your feed!), and posts excellent, in-depth analysis, focussing mainly (at time of writing) on curation issues, but does also post on a more eclectic selection of topics - for example he’s analysed the recent @steem undelegation of steempower and even the number of people who expose their private keys when making transfers.



@eastmael is a data analyst who produces lots of technical posts about projects on utopian-io, and his data-analysis posts also tend to focus on everything utopian. For example:

Utopian top projects - a closer look

Comparing Utopian Translation Rewards with Existing Services

I also quite like this cheeky little post analysing his own voting behaviour (following a delegation from @paulag) - covering number of votes, and most voted people, reps and tags.

@eastmael’s also very fastidious at outlining how he got the data, meaning you can easily replicate what he’s doing, and he’s a very generous resteemer of all things data.



@eroche has produced some of the posts which are of most interest to me as a sociologist concerned with issues of inequality, for example:

Steem - post payouts and steem health is to my mind one of the most interesting posts ever produced on steemit - basically it shows how high rep users get paid 1000 times more than lower rep users. There is a part 2 too, in which @eroche suggests that the level of inequality is at least decreasing.

@eroche is actually doing a series taking a ‘deeper dive’ into steem user data at time of writing. Examples of other interesting posts in this series include:
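
To make the 'payout by reputation' comparison concrete, here is a small sketch that groups posts into reputation bands and compares average payouts. The band width, the (reputation, payout) layout and the sample numbers are invented for illustration (they are chosen to give a 1000x gap) and are not taken from @eroche's post.

```python
from collections import defaultdict
from statistics import mean


def mean_payout_by_rep_band(posts, band_width=10):
    """Average payout per reputation band.

    `posts` is an iterable of (reputation, payout) pairs - an assumed layout.
    Each band is keyed by its lower bound, e.g. 20 covers reputations 20-29.
    """
    bands = defaultdict(list)
    for reputation, payout in posts:
        lower = (reputation // band_width) * band_width
        bands[lower].append(payout)
    return {lower: mean(payouts) for lower, payouts in sorted(bands.items())}


# Made-up posts: (author reputation, payout in dollars)
sample_posts = [(25, 0.01), (27, 0.03), (45, 1.0), (52, 4.0), (71, 30.0), (74, 10.0)]

by_band = mean_payout_by_rep_band(sample_posts)
ratio = by_band[70] / by_band[20]

for lower, avg in by_band.items():
    print(f"rep {lower}-{lower + 9}: average payout {avg:.2f}")
print(f"top band earns about {ratio:.0f}x the bottom band")
```

Averages hide a lot, of course: a median per band, or the full distribution, would show whether a few very large payouts are driving the gap.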

Distribution of posts by account - basically 93% of accounts post less than once a day.
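
A figure like this is straightforward to reproduce once you have a post count per account. The sketch below assumes a simple dict of account name to number of posts over a fixed window; the names and numbers are made up and the function is hypothetical.

```python
def share_posting_less_than_daily(posts_per_account: dict[str, int], days: int) -> float:
    """Percentage of accounts averaging fewer than one post per day."""
    if days <= 0:
        raise ValueError("days must be positive")
    if not posts_per_account:
        return 0.0
    below = sum(1 for count in posts_per_account.values() if count / days < 1)
    return 100 * below / len(posts_per_account)


# Made-up post counts over a 30-day window
sample_counts = {"alice": 45, "bob": 3, "carol": 0, "dave": 29}

share = share_posting_less_than_daily(sample_counts, days=30)
print(f"{share:.0f}% of accounts post less than once a day")
```

Whether accounts with zero posts are included in the denominator changes the answer a lot, so it is worth stating which population a percentage like this is based on.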

Geographical Trends - this is only based on users who identify their location, but on this basis the USA and the UK stand out as the biggest regions of use, along with the Philippines, Venezuela and Nigeria.



@miniature-tiger produces more macro-analytical posts, such as ‘yearly round ups’ of how different steem platforms have performed, and long term trends focussing on just one aspect of steemit, such as curation rewards.

Examples of posts include:

Utopian-io - full year in statistics - which includes an overview of the number of users and top paid users. He’s done similar posts for esteem and busy.

Auction bots and their users - focusing on what kind of return you get from different bots

Steemit’s fair-weather friends - December update on user retention numbers - a nice, granular piece of analysis which really digs down to show that active users increase with the steem price, but that this growth is largely driven by returning users with ‘dormant’ accounts which have been open for a while.



@morningtundra is relatively new to steemit but has already produced some very nice analysis posts, and some useful tutorials on how to produce charts on steem data. This time series analysis of steemit, Q4 2017 is a good example.



@remuslord is also relatively new to steemit but has already done some nice analysis posts on the growth of steemit users by country, and I particularly like this post on the ‘most social categories on steemit’ which looks at user interaction by category.



@steemitph produces at least one data analysis post per week, focussing on various different aspects of steem data. Two recent examples include:

This excellent post on the relative rewards received by posts and comments, going back to early 2016.

This post analysing how the price of SBD changes post pay-out preference.

He’s also very generous in resteeming other people’s data-analysis posts.


Anyway, that concludes my list for now, apologies if I’ve missed anyone off, but this has taken far too long in the writing and I’ve still got to deal with the awkward back-end of putting this up on the platform, but now I feel I’ve got a much better handle on who does what with steem data!

The importance of data

In order to assess the progress of steemit, we need data, and these are the people who access it, clean it and present it to us in accessible form, so I think we owe them our thanks.

However, as all of the above analysts know, the available data is limited to what’s on the chain, and this can only tell us so much; its significance is open to interpretation.

What we don’t get to see (for the most part) is the people behind the posts and their motivations for posting; we know next to nothing about how offline and steem-line worlds intersect; and we lack data on why so many people simply don’t hang around here for very long.

To end on a slightly contentious note: could we actually be blinding ourselves with big-data?

Having said that I appreciate the work all the data-analysts on here do, there is a chance that the sum of all of this ‘analysis’ isn’t actually that useful in helping us to understand the social significance of the platform - to assess that, and to really get a deeper* understanding of the platform, we’d probably need more qualitative data from a truly representative sample of users from across the reps/ categories/ etc/ INCLUDING the people who came and left.

Well, I had to keep a sociological ace up my sleeve to end on!


Upvoted on behalf of the dropahead Curation Team!

Thanks for following the rules.

DISCLAIMER: dropahead Curation Team does not necessarily share opinions expressed in this article, but find author's effort and/or contribution deserves better reward and visibility.

Help us giving you bigger upvotes by:

Upvote this comment!
Upvote & Resteem the latest dropahead Curation Reports!
Join the dropahead Curation Trail
to maximize your curation rewards!
Vote dropahead Witness with SteemConnect
Proxy vote dropahead Witness
with SteemConnect
Donate STEEM POWER to @dropahead
12.5SP, 25SP, 50SP, 100SP, 250SP, 500SP, 1000SP
Do the above and we'll have more STEEM POWER to give YOU bigger rewards next time!

News from dropahead: How to give back to the dropahead Project in 15 seconds or less


Quality review by the dropahead Curation Team

According to our quality standards(1), your publication has reached a score of 75%.

There are many details that can be improved. Baby steps!

(1) dropahead Witness' quality standards:

- Graphic relation to the text (Choice of images according to the text)
- Order and coherence
- Style and uniqueness (Personal touch, logic, complexity, what makes it interesting and easy to understand for the reader)
- Images source and their usage license


Hi - thanks for upvote, but can you tell me where the 'quality standards' are published and how the score is calculated?! It's very useful to get feedback, but I don't get what '75%' means?!


Just a few corrections for me :)

(results are weekly, but based on the last two weeks worth of data)

The data is weekly. I go 2 weeks back to check on delegations in and out though as this screws the data for a given individual. So you will only need to be 'glued' for a week!

And a smaller one :)

This is a new thread as of late Jan 2018,

Started it in December - just had a different top image :)

Really cool to read about what the other analysts on the list are up to - there's a couple i need to follow!

And thanks for the mention too!


Noted, thanks for the corrections.

Hey @revisesociology, wow, thanks for the mention! It's a great honor to be on your list. Thanks for highlighting the data analysis community here on steem! :)


My pleasure - and thanks for all the analysis... hopefully you'll get a few more followers out of it!

Hi @revisesociology, you honor me more than I deserve sir. It is a proud moment to be acknowledged by an educator like yourself. Thanks a ton! This is a brilliant list BTW. I wait for and check posts from all the authors in this list. Some of them were, still are inspiration on getting started with data analysis posts.


Everyone on here's an inspiration! The more the merrier I say!

thank you very much for the mention, nice round up post. Just one quick thing. The udemy course is not yet live, wont be live till the end of Feb. Very excited about it too. :-)


No worries, I know how much work goes into these things.

I think Udemy's a good option for your serious course material, rather than @DTube with its eclectic reward pattern.

Excellent summary. Thanks for mentioning my work and providing information about SteemSQL.

Thank you for the honorable mention! I’m also interested in the sociological nature of the population. I’d welcome your ideas on how to go after them.

This is brilliant @revisesociology / wow! Thank you for putting in the work on this! I'll pass this along to my contacts here. Amazing.


Thankyou so much!

P.S. you might want to let yr voting power recharge - the general rule is keep it above 80%!


Thanks for the tip, @revisesociology ! I will do. Thanks!

This is a very useful guide for those interested in the way Steemit actually works. I had been looking for something just like that. Resteemed!


They're all very useful people to follow!


Hi - glad you found it useful!

Friend, I come here to support you, upvote, must!

Hi! Thanks for upvoting my post! I follow you. ; )

u have a great analysys categories in steemit, thanks for sharing @revisesociology, i like it

thank you very much for this post. I find this helpful especially as a newbie.

Excellent summary and shout out. I've got about half these on my follower list already. Will check out the other half this week, thanks.

Thanks for the information

Thanks, I'm always trying to understand Steemit a bit better. I will have to take a look into these authors.

An excellent write-up. It's an honor to be included in the list with these guys (and gal).


My pleasure - actually I hadn't thought to comment on the gender dimension - old gendered subject choices die hard I guess! P.S. Having a little trouble upvoting yr comment for some reason!

Thanks for the info

Thx for this great and very useful summary. A good starting point for evryone who wants to dive deeper into the data of the platform. therfor thank you and resteem. p.s sorry for some reason i does not work to upvote your post...