Steemit Retention Part 2 – Business Intelligence Steemit

in #bisteemit, 7 years ago

Yesterday we looked in detail at Retention for Authors, and we calculated the 60 day retention rate to be 23.3%. However, as we all know, Steemit is not just about Authors. It is also about Curators.

In this article, we are going to analyse steemit data for votes to try and calculate the retention rate for Curators.

If you missed yesterday’s analysis you can find it here

https://steemit.com/bisteemit/@paulag/steemit-retention-part-1-business-intelligence-steemit

In the final post, we will combine the results for part 1 and 2 of this analysis and see how Steemit performs against other social media platforms.

Curator Retention

Just so you have the values handy, the chart below shows the number of new accounts registered by month for both 2016 in black and 2017 in turquoise.

High level overview

  

The first thing I, and probably you, notice from this data is that nearly every account that has registered on Steemit has placed at least 1 vote.  There are only 2 accounts out of 361k that have not voted.

The pie chart makes it clear that, although nearly every account has voted, there is a big drop-off. Over 57% of accounts have not voted in the last 60 days. However, over 28% (102K) of the registered accounts have voted in the last 30 days. This is nice and high, as only 52K accounts posted or commented in the last 30 days.

2016

  

There were 118K accounts registered on Steemit in 2016; however, only 11K of these accounts (<10%) have voted in the last 60 days.

The chart on the right shows, by month of registration, the percentage of users that are, or are not, actively voting.

2017

  

The visualisations for 2017 show a much more promising situation. 58% of accounts registered have voted in the last 60 days. You must keep in mind that 32% of the accounts registered in 2017 were also registered in the last 60 days. This would mean that 26% of accounts registered between Jan and mid-July have voted in the last 60 days.

Clusters and Outliers

The last analysis I carried out on the data was to look at the overall clusters and outliers between the number of days accounts have been registered on steemit and the voting ‘active days’.

To calculate the ‘active days’ of a curator I took the last vote date for each user and subtracted the account creation date from it.
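As a rough illustration of that calculation outside Power BI, here is a minimal Python sketch. The account names and dates are made up for the example, and `active_days` is a hypothetical helper, not the DAX measure used for the charts.

```python
from datetime import date


def active_days(created: date, last_vote: date) -> int:
    """Days between account creation and the account's most recent vote."""
    return (last_vote - created).days


# Hypothetical accounts: (name, account creation date, last vote date)
accounts = [
    ("alice", date(2017, 6, 1), date(2017, 6, 3)),    # stopped voting within days
    ("bob", date(2016, 8, 15), date(2017, 9, 10)),    # still voting over a year later
]

for name, created, last_vote in accounts:
    print(name, active_days(created, last_vote))
```

An account like the first one would sit in the mass close to zero active days, while the second would appear far along the active days axis.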

 

The red mass running up the Days Registered axis on the left gives an indication of the number of curators that were only active for a very short period of time, a matter of days. The further to the right you move on the chart, the longer users have remained active on steemit.

I was expecting to see a cluster around the 0-100 days registered because there has been a spike in new users during this time.

Steemit Curator Retention Rate

In yesterday’s post I introduced you to the calculation for retention. We are going to calculate the 60 day retention rate for curators.

Retention Rate = (CE/CN) x 100

CE = number of users at end of period (that have voted in the last 60 days)= 152K

CN = number of new users acquired during period = 358K

therefore : 152k / 358k = .424 x 100  = 42.4%

Curator retention calculated to be 42.4% of total users.
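For anyone who wants to reproduce the arithmetic, the formula can be sketched in a few lines of Python. The function name `retention_rate` is my own for illustration, and the inputs are the rounded figures quoted in this post rather than exact account counts.

```python
def retention_rate(active_at_end: float, acquired: float) -> float:
    """Retention Rate = (CE / CN) x 100, as defined in this post."""
    if acquired <= 0:
        raise ValueError("acquired must be positive")
    return active_at_end / acquired * 100


ce = 152_000  # accounts that have voted in the last 60 days (rounded)
cn = 358_000  # accounts acquired during the period (rounded)

rate = retention_rate(ce, cn)
print(f"Curator retention: {rate:.2f}%")
```

With these rounded inputs the ratio comes out at roughly .4246, which matches the 42.4% in the post when the ratio is truncated to .424.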

All of the data taken in this report was from the superb steemsql managed by @arcange.  

I have used Power BI to connect to steemsql and then used DAX calculations for the analysis.

 I am part of a Steemit Business Intelligence community. We all post under the tag #BIsteemit. If you have an analysis you would like carried out on Steemit data, please do contact me or any of the #bisteemit team and we will do our best to help you...

You can find #bisteemit on Discord: https://discordapp.com/invite/JN7Yv7j

Follow, upvote and resteem 



Very nice post.. already i am upvoted your post,,i am new at here,,i'm following you,,,please follow me back N upvote my post, thanks

@paulag, while it's mostly a bloodbath today STEEM and especially I/O Coin are two of the winners right now... what do you make of it??
https://coinmarketcap.com/currencies/iocoin/

I dont know anything about ioc, but steem is still down too

Thanks for doing this work.

I think either I misunderstand the way the calculation works, or the data needs to be filtered by time a bit.

If you take the data for the last 60 days and run your calculation, you will find the retention rate to be 100%, since every account votes at least once, and no account therefore will not have voted in the last 60 days.

I reckon you need to exclude the last 60 days of new accounts from the calculation in order to prevent the data from being skewed.

Or do I misunderstand?

Yes, I have pointed this out in the article

"therefore : 152k / 358k = .424 x 100 = 42.4%"

But this figure includes those who joined since mid-July.

"You must keep in mind that 32% of the accounts registered in 2017 have also registered in the last 60%. This would mean that 26% of accounts registered Between Jan and mid-July are have voted in the last 60 days."

This would seem to be the correct percentage, rather than the calculation above, since it does not include the accounts that joined within the last 60 days.

Except that it excludes the accounts that joined in 2016.

"There were 118K accounts registered on Steemit in 2016, however only 11k of these accounts (<10%) have voted in the last 60 days"

Which is significantly lower than even 26%.

My guess is that, excluding the accounts that signed up in the last 60 days, the real retention rate is therefore between 26% and <10%. Is this correct?

I would tend to agree; however, the 'official' formula for retention as shown above includes all users to the end of the period being analysed.

Can you provide the number of accounts that joined in 2017 up to mid-July, and the number of them that have voted in the last 60 days? With that information I can figure the adjusted retention rate.

Thanks!

if what we want to know is how many users stay for more than 60 days, accounts less than 60 days old are not useful data to ascertain that information. No accounts less than 60 days old have stayed for 60 days.

If what we want to know is how many users are inactive for at least 60 days, again, accounts less than 60 days old aren't useful data. Even if those users are no longer active, they can't have been inactive for more than 60 days.

In either case, including accounts less than 60 days old in calculating 60 day retention rates skews the data, as all the accounts less than 60 days old are 100% retained, and 0% of accounts less than 60 days old have voted for more than 60 days, according to the criteria, simply because they aren't 60 days old yet.

This is why I asked for the numbers excluding accounts less than 60 days old. You provided numbers for 2016, and provided a percentage for accounts more than 60 days old in 2017, but not the number of accounts, so I cannot add the 2017 number to the 2016 number and come up with a total.

Can you please provide the number of accounts for 2017 excluding those that are less than 60 days old?

Thanks!

Edit: presently, the best calculation of long term retention is the figure you provide for 2016 accounts, as all of them have been accounts for more than 60 days. This is <10%, and that is the apparent 1 year retention rate on Steemit.

For 2017, the percentage you provided that excludes accounts less than 60 days old is 26%. I expect the real percentage of total accounts more than 60 days old - the only accounts that can provide relevant data for 60 day retention rate - is somewhere between 26% and <10%.

You are looking at about 62K accounts registered in the last 60 days.

So, if we exclude the 62k from both the total accounts, and the accounts that have voted in the last 60 days, we should have corrected for the effect of the criteria.

So 90k/296k = .30 * 100 = 30% overall. I am flummoxed that this is not between 26% and 10%.

Can you explain why?

Thanks!
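The exclusion described in the comment above can be sketched as follows. This is only an illustration under the commenter's stated assumption that every account younger than 60 days counts as having voted in the last 60 days; `adjusted_retention` is a hypothetical helper and the inputs are the rounded figures from this thread.

```python
def adjusted_retention(total: float, active: float, recent: float) -> float:
    """Retention with accounts younger than the window removed.

    Assumes all `recent` accounts are counted among `active`, so they are
    subtracted from both the numerator and the denominator.
    """
    eligible_total = total - recent    # accounts old enough to be judged
    eligible_active = active - recent  # of those, the ones still voting
    if eligible_total <= 0:
        raise ValueError("no accounts older than the window")
    return eligible_active / eligible_total * 100


# Rounded figures from the post and comments, in thousands of accounts
rate = adjusted_retention(total=358, active=152, recent=62)
print(f"Adjusted retention: {rate:.1f}%")
```

This reproduces the 90k / 296k ratio from the comment, a little over 30%.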

I would look at that calculation a different way

a) 152-90 = 62 (no of active accounts at end of period - number of new accounts during period)

b) 62 / 296 = .2094 ( a / no of accounts at start of period)

.2094 * 100 = 20.94%.

Thanks, but ...
Those images are too small...

if you right click on the image and select open in new window, you will see the large image

@paulag Very well performed for sticking at it! It's really a new technique for lifetime therefore you are modern-day pioneers. Adore it..

ah thanks

nice information,thanks for sharing

you are welcome

@paulag Sharing to have this witnessed far more (and perhaps open up the eyes of some)! Thanks for the properly put up and documented report! Resteemed.

I try to upvote people that are growing in the ranks, and then also for some dolphins and newbies as it occurs. I don't have enough Steem yet to really think too much about curation.

now I'm just waiting for someone to tell me what this all means!!!! lol.

Hello @paulag. Do you have a post I can refer to in order to connect to steemsql via Power BI?

No, not yet (DM me on Discord and I will sort you out).
