- How many active users are there on steemit and where are they located?
- How are the rewards distributed?
- How many comments are there per post?
- What are the highest paid posts this week?
- What’s the relationship between reputation and reward?
If you want to know the answers to these, and lots of other basic questions, then you need to follow these eleven people:
These are the people here on steemit who do some truly excellent work extracting data from the steem blockchain and publishing regular statistical posts summarising and analysing (different aspects of) the ‘state of the platform’.
The following two tags are useful to keep abreast of data-analyst posts
If you’re interested in tools which show you your own steem-data in an easy-to-read format, check out this post - six useful tools for tracking your progress on steemit. If you’re interested in a wider range of tools which allow you to access and analyse aspects of the steem blockchain - go to steemtools.com.
The purpose of this post is simply to provide a list of people who do data analysis on steemit, rather than a list of good data analysis posts - the latter would be pointless anyway, because this post will be locked in the blockchain and dead in seven days from time of writing, whereas by providing links to the people who produce the analysis, at least it will have a ‘live functionality’. I've mainly written it for myself to give me a better handle on who does what with data on steemit, but I figure a few other people out there might find this useful too....
Data analysts on steemit, and a rough outline of what they ‘specialise’ in:
I’ve selected @arcange and @penguinpablo to include at the top of the list because these are the two guys who produce daily stats updates, and (as far as I can tell) have also developed their own applications to display steem-data; and @paulag simply because she produces far more unique data analysis posts than everyone else; finally I’ve included all the other data-analysts in alphabetical order...
Everyone defers to @arcange, so he has to top the list: he’s the guy who developed SteemSQL* after all, which is what most of the people below use to extract data from the blockchain.
@arcange produces the daily and weekly Steem Statistics Reports in multiple languages: just click on his name above for the latest posts, they’re all daily/weekly. These are the posts which will enable you to answer some of the most basic questions about steemit, such as ‘how many active users are there?’
Specifically @arcange’s steem stats posts cover the following 11 trends:
- New users (daily for the past 30 days and split into active and non-active)
- Active users - long term trends over the past 9 months
- Number of posts and comments over the past 9 months
- Number of Upvotes (last 9 months)
- Total number of transactions (last 9 months)
- Most used tags over the whole steem blockchain lifetime
- Reputation distribution
- Number of active users according to voting power (i.e. whales, minnows, dolphins etc.)
- Cumulative voting power by user category
- Maximum post payout for the last nine months
- Average post payout for the last nine months
@arcange also posts the following regular posts which will probably be of more ‘popular’ appeal:
The daily hit parade - which lists the top ten posts by number of upvotes, comments and payout (they tend to overlap!)
The Daily hit parade for newcomers - the same as above, but for users with a reputation below 50.
In addition to this he has produced some useful tutorials on how to use steemsql, for example: How to create a steem analytic report with excel. Links to more info and updates are available on the separate steemsql website.
*NB - as of January 2018, if you want to access steemSQL you need to pay a subscription to @steemsql on steemit. @arcange kept the service free for about a year, but servers cost money, and so it became a sub-service from January 2018 - the rationale, and the details of how to subscribe, are outlined in this post.
P.S. if you can’t afford the full subscription but want to make a donation to support steemsql, you can do that too!
I’ve included @penguinpablo second on this list because in addition to producing some useful daily and weekly steem stats reports, which are a nice supplement to @arcange’s, he’s also developed two very useful tools which you can use to explore your own (or anyone else’s) data:
Steemnow allows you to find out what your current 100% upvote is worth, along with the specific rewards you received for various posts and comments you made.
Steemblockexplorer is @penguinpablo’s most recent (at time of writing) tool - which allows you to explore various aspects of what’s occurring on steemit, including your own activity: the basics are explained in this post: announcing steemblockexplorer.com.
@penguinpablo produces daily and weekly steem stats reports
Focusing on the daily reports these cover:
- Daily number of posts including comments
- Daily number of votes
- Daily active users
- Posts relative to comments per post
- Number of new accounts created
- Daily amount of SBD converted to steem
- Daily amounts of steem powered up or down (including the largest)
- Daily steem transfers to and from exchanges (including the largest)
- Steem Price updates
Finally, he also posts daily cryptocurrency price updates, focussing on changes in the last 24 hours. At time of writing, steem really stands out as gaining while other currencies stagnate.
@Paulag produces more unique data analysis posts than anyone else - there are really too many to be able to list usefully, but here are a few of my recent favourites to give you a flavour:
Monthly - new user analysis report - which focuses on monthly users and posting, and she tags in steem cleaners to deal with spammy accounts, which is a nice touch.
Monthly - an analysis of upvotes from the minnow support account - I’m sure we’ve all benefited from this most excellent initiative in our time!
Scraping and Analyzing the Steemit Trending Page – Blockchain Business Intelligence - this is a very interesting analysis of which posts trend, and what rep the users have (mainly 67+). What I especially like about this post is the questions to steeminc at the end lamenting the data we don’t have access to, reminding us that we are limited to the data we have!
Just recently @paulag has been focussing on producing a ‘round up of 2017’ - analysing the performance of different tags on steemit, for example.
Finally, she does a lot to promote steemit offline, and has recently launched a ‘cracking steemit in 28 days course’ on Udemy, which looks comprehensive, to say the least!
@abh12345 is extremely interactive on steemit, and one of the main people promoting community development and curation - he produces three regular weekly data analysis posts (links below to latest posts at time of writing):
Firstly: the curation leagues - a two league curation competition with prizes, which is one of the more unique competitions here on steemit, with some great peops who take part.
- League one is decided simply by how many SP you get per 1000 SP for voting over a fortnight.
- League two is decided by your interactivity - how many people you vote for, how many comments you make and so on (with a negative score for self-voting).
So the analysis here is limited to the few dozen people who currently take part, but it’s still interesting to see who tops the table every week (results are weekly, but based on the last two weeks’ worth of data).
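To make the league one measure concrete, here is a minimal Python sketch of ‘SP per 1000 SP’. It is only an illustration, not @abh12345’s actual method: the function name, the account names and the figures are all made up, and it assumes the score is simply curation SP earned over the fortnight divided by SP held, scaled to 1000.

```python
def sp_per_1000(curation_sp_earned: float, sp_held: float) -> float:
    """Curation SP earned per 1000 SP held (assumed league one score)."""
    if sp_held <= 0:
        raise ValueError("sp_held must be positive")
    return curation_sp_earned / sp_held * 1000


# Hypothetical entrants: (curation SP earned over the fortnight, SP held).
entrants = {
    "alice": (1.2, 400.0),
    "bob": (3.0, 2000.0),
}

# Rank entrants from highest to lowest score.
league_one = sorted(
    ((name, sp_per_1000(earned, held)) for name, (earned, held) in entrants.items()),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in league_one:
    print(f"{name}: {score:.2f} SP per 1000 SP")
```

Note that in this made-up example the smaller account comes out on top even though it earned less SP in absolute terms, which is the point of scaling by SP held.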
Just drop a comment onto the latest curation league post - although unless you’re glued to your computer for a fortnight, you’ve got 0 chance of winning, so if you’re one of the unfortunates who has to work for a living, maybe save joining in until you’re on holiday.
This is a new thread as of late Jan 2018, so I assume it’s going to be regular…. @fulltimegeek delegated a shedload of steempower to around 50 people a while back, and this post (these posts) is (will be) an ongoing analysis of their interactions. It kind of goes with the ‘curation leagues’ - as these stewards are (supposed to be) the uber-curators on steemit. NB the reason that you probably won’t ever top the curation leagues is that some of these people take part (then again, it’s the taking part that counts!)
Thirdly - the utopian-io - contribution 'approved/rejected' analysis - a regular, weekly analysis of the thousands of posts submitted to utopian-io and whether they were approved or rejected (roughly 70-30), broken down by category. Technically
@abh12345 also produced this analysis of @anomadsoul’s interactions on steemit - an interesting read, which outlines the number of votes, self-vote percentage and average rep/ voting weight voted for, so maybe look out for more of this sort of thing too.
@crokkon’s responsible for one of my favourite ever data analysis posts on steemit - curation rewards you could have had - which analyses the SP rewards gained for different accounts based on the voting time…. It’s common knowledge that there’s a ‘max-reward’ sweet spot for voting around the 20-30 minute point, but this post shows that the exact time varies from account to account….
Overall @crokkon posts 4-8 times a month (so he won’t clog up your feed!), and posts excellent, in-depth analysis, focussing mainly (at time of writing) on curation issues, but does also post on a more eclectic selection of topics - for example he’s analysed the recent @steem undelegation of steempower and even the number of people who expose their private keys when making transfers.
@eastmael is a data analyst who produces lots of technical posts about projects on utopian-io, and his data-analysis posts also tend to focus on everything utopian. For example:
I also quite like this cheeky little post analysing his own voting behaviour (following a delegation from @paulag) - covering number of votes, and most voted people, reps and tags.
@eastmael’s also very fastidious about outlining how he got the data, meaning you can easily replicate what he’s doing, and he’s a very generous resteemer of all things data.
@eroche has produced some of the posts which are of most interest to me as a sociologist concerned with issues of inequality, for example:
Steem - post payouts and steem health is to my mind one of the most interesting posts ever produced on steemit - basically it shows how high rep users get paid 1000 times more than lower rep users. There is a part 2 too, in which @eroche suggests that the level of inequality is at least decreasing.
@eroche is actually doing a series taking a ‘deeper dive’ into steem user data at time of writing. Examples of other interesting posts in this series include..
Distribution of posts by account - basically 93% of accounts post less than once a day.
Geographical Trends - this is only based on users who identify their location, but based on this the USA and the UK stand out as the biggest regions of use, along with the Philippines, Venezuela and Nigeria.
@miniature-tiger produces more macro-analytical posts, such as ‘yearly round ups’ of how different steem platforms have performed, and long term trends focussing on just one aspect of steemit, such as curation rewards.
Examples of posts include:
Utopian-io - full year in statistics - which includes an overview of the number of users and top paid users. He’s done similar posts for esteem and busy.
Auction bots and their users - focusing on what kind of return you get from different bots.
Steemit’s fair-weather friends - December update on user retention numbers - a nice, granular piece of analysis which really digs down to show that active users increase with the steem price, but that the growth in active users is largely driven by returning users with ‘dormant’ accounts which have been open for a while.
@morningtundra is relatively new to steemit but has already produced some very nice analysis posts, and some useful tutorials on how to produce charts on steem data. This time series analysis of steemit, Q4 2017 is a good example.
@remuslord is also relatively new to steemit but has already done some nice analysis posts on the growth of steemit users by country, and I particularly like this post on the ‘most social categories on steemit’ which looks at user interaction by category.
@steemitph produces at least one data analysis post per week, focussing on various different aspects of steem data. Two recent examples include:
This excellent post on the relative rewards received by posts and comments, going back to early 2016.
This post analysing how the price of SBD changes post pay-out preference.
He’s also very generous in resteeming other people’s data-analysis posts.
Anyway, that concludes my list for now, apologies if I’ve missed anyone off, but this has taken far too long in the writing and I’ve still got to deal with the awkward back-end of putting this up on the platform, but now I feel I’ve got a much better handle on who does what with steem data!
The importance of data
In order to assess the progress of steemit, we need data, and these are the people who access it, clean it and present it to us in accessible form, so I think we owe them our thanks.
However, as all of the above analysts know, the available data is limited to what’s on the chain, and this can only tell us so much, and its significance is open to interpretation.
What we don’t get to see (for the most part) is the people behind the posts and their motivations for posting; we know next to nothing about how offline and steem-line worlds intersect; and we lack data on why so many people simply don’t hang around here for very long.
To end on a slightly contentious note: could we actually be blinding ourselves with big-data?
Having said that I appreciate the work all the data-analysts on here do, there is a chance that the sum of all of this ‘analysis’ isn’t actually that useful in helping us to understand the social significance of the platform - to assess that, and to really get a deeper* understanding of the platform, we’d probably need more qualitative data from a truly representative sample of users from across the reps/ categories/ etc/ INCLUDING the people who came and left.
Well, I had to keep a sociological ace up my sleeve to end on!