Monthly Analysis of Automated Posts (January 2018)

in #utopian-io, 8 years ago (edited)

Each month I look at the tide of automated posts that threatens to swamp the Steem blockchain. In November we saw how the rapid escalation of such posts meant that 31 accounts were producing almost 20% of all posts on the blockchain. In December we saw how the efforts of @patrice and @spaminator had nullified these accounts only for a new batch of automators to rise up in their place. How have events progressed into January?

RisingTide2.png

In this study I will look at:

  • The distribution of Posts Per Account for posts made over the month of January 2018. A comparison is then made to December 2017 and November 2017. The aim is to examine the bump in the tail of the distribution caused by mass-production accounts and to review whether this bump is increasing or decreasing.
  • The post numbers and payouts over time from those accounts producing more than 1000 posts in January 2018, again with a comparison back to November 2017 and December 2017. The aim is to illustrate whether these accounts are increasing activity over time or if they are being closed down.


0 Summary of Findings and Conclusions

I start by presenting the summary of findings for readers who have limited available time. The full details of the analysis are included in the later sections of this article.

  • This analysis includes only posts; it does not include comments
  • All payouts considered throughout this article are expressed in the currency as seen on the Steemit interface, i.e. SBD treated as $1.

0.1 Distribution of Post Numbers by Account

This first analysis looks at the distribution of Posts Per Account for posts made over the month of January 2018. A comparison is then made to December 2017 and November 2017.
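As an illustration of what "distribution of Posts Per Account" means, here is a minimal Python sketch. It is not part of the original workflow (which used SQL and a spreadsheet); the function name and the sample author names are hypothetical, and the input is assumed to be one author name per top-level post.

```python
from collections import Counter

def posts_per_account_distribution(authors):
    """Map 'posts made in the month' -> 'number of accounts with that count'."""
    posts_by_author = Counter(authors)        # author -> number of posts
    return Counter(posts_by_author.values())  # post count -> number of accounts

# Hypothetical sample: one author name per top-level post.
sample_authors = ["alice", "bob", "alice", "carol", "alice", "bob"]
distribution = posts_per_account_distribution(sample_authors)

for posts, accounts in sorted(distribution.items()):
    print(f"{accounts} account(s) made {posts} post(s)")
```

Plotting the number of accounts against the post count gives the kind of curve shown in the charts; accounts issuing very large volumes of posts appear as a bump far out in the tail.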

January 2018

Screen Shot 2018-02-07 at 13.01.15.png

December 2017

Screen Shot 2018-02-07 at 13.00.15.png

November 2017

Screen Shot 2018-02-07 at 13.02.37.png

The bumps in the tails of the distributions are produced by those accounts that issue large volumes of automated posts. We can see that while the overall number of posts (left-hand axis) has increased very significantly over January 2018, the “bump” has decreased, both proportionately and in actual numbers.

Below is the comparison of these tails showing the month-on-month reduction in scale.

Screen Shot 2018-02-07 at 13.22.34.png

We have moved from:

  • November - 31 accounts each produced more than 1000 posts in the month, making up 18.6% of all posts on the blockchain in the month.
  • December - 45 accounts each produced more than 1000 posts in the month, making up 12.0% of all posts on the blockchain in the month.
  • January - 16 accounts each produced more than 1000 posts in the month, making up 2.1% of all posts on the blockchain in the month.

(A similar reduction is found if we drop down to 500 posts)
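The percentages above are the share of all posts in the month that came from accounts over the threshold. A rough Python sketch of that arithmetic follows; the function name and the sample figures are hypothetical and are not taken from the blockchain data.

```python
def high_volume_share(posts_by_author, threshold=1000):
    """Return (number of accounts above threshold, their share of all posts)."""
    total_posts = sum(posts_by_author.values())
    heavy = {a: n for a, n in posts_by_author.items() if n > threshold}
    share = sum(heavy.values()) / total_posts if total_posts else 0.0
    return len(heavy), share

# Hypothetical monthly post counts per account (illustrative only).
sample = {"bot_a": 1500, "bot_b": 1200, "human_a": 40, "human_b": 260}
accounts, share = high_volume_share(sample)
print(f"{accounts} accounts, {share:.1%} of all posts")
```

Passing `threshold=500` gives the equivalent figure for the lower cut-off mentioned above.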


0.2 Post Numbers and Payouts over Time from Accounts Producing More Than 1000 Posts in a Month

In this second analysis I look at the post numbers and payouts over time for those accounts producing more than 1000 posts in January 2018, again with a comparison back to November 2017 and December 2017.
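To make the shape of this second analysis concrete, the sketch below groups posts by day and author for a chosen set of high-volume accounts. It is an assumption-laden simplification written in Python for illustration: the record layout, the function name and the single `payout` value per post are hypothetical (the SQL script at the end keeps pending, curator and total payouts as separate columns).

```python
from collections import defaultdict
from datetime import datetime

def daily_totals(posts, heavy_authors):
    """posts: iterable of (author, created: datetime, payout: float).

    Returns {(date, author): (post count, summed payout)} for heavy_authors only.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for author, created, payout in posts:
        if author in heavy_authors:
            entry = totals[(created.date(), author)]
            entry[0] += 1        # one more post that day
            entry[1] += payout   # accumulate that day's payout
    return {key: tuple(value) for key, value in sorted(totals.items())}

# Hypothetical sample records (illustrative only).
sample_posts = [
    ("bot_a", datetime(2018, 1, 1, 9, 0), 0.50),
    ("bot_a", datetime(2018, 1, 1, 17, 30), 0.25),
    ("human_a", datetime(2018, 1, 1, 12, 0), 3.00),
    ("bot_a", datetime(2018, 1, 2, 8, 15), 0.00),
]
result = daily_totals(sample_posts, heavy_authors={"bot_a"})
for (day, author), (count, payout) in result.items():
    print(day, author, count, payout)
```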

January 2018

Screen Shot 2018-02-07 at 13.37.06.png

December 2017

Screen Shot 2018-02-07 at 13.37.16.png

November 2017

Screen Shot 2018-02-07 at 13.37.24.png

It is positive to see the marked contrast between January 2018 and the prior two months.

For January 2018 we can see that a small number of accounts attempted to scale up production in December and carry those volumes through into January. After a short period of minor success, the rewards and article volumes from these accounts have been reduced to zero. A brief look at some of these accounts indicates that this is due to the work of @patrice, with the aid of @mack-bot for flagging spam.

Overall the volume of articles and payouts from 1000+ output accounts is markedly reduced in January 2018, meaning less spam clogging up the blockchain and a lower volume of rewards stripped from the rewards pool.

For December and November 2017 it is also noticeable that the volume of articles from the accounts that previously produced 1000+ posts per month has now fallen close to zero. The occasional rewards continuing through January are due to one user who has switched to using their technical skills in a positive manner to produce a blockchain project.


0.3 Conclusions

The distributions of post numbers by account illustrate that significant progress has been made in the battle against automated posts. The number of accounts producing large volumes of posts is falling, and their share of all posts has dropped from nearly 20% to close to 2%.

The analysis of post numbers and payouts over time illustrates that automated accounts are now being identified and flagged and their rewards reduced to zero. The removal of this incentive should act as a strong deterrent for users seeking to extract rewards from the blockchain through this method.

It does appear that the tide is being turned back!


Outline

  • 0 Summary of Findings and Conclusions (see above)
  • 0.1 Distribution of Post Numbers by Account
  • 0.2 Post Numbers and Payouts over Time from Accounts Producing More Than 1000 Posts in a Month
  • 0.3 Conclusions
  • 1 Scope of Analysis
  • 2 Tools Used
  • 3 Scripts


1 Scope of Analysis

The analysis is based on the data for all user accounts who posted articles in the months of November 2017, December 2017 and January 2018.

The data has been obtained through SQL queries of SteemSQL, a publicly available Microsoft SQL database built and maintained by @arcange and containing all the Steem blockchain data.

The data has been filtered by date using the .created timestamps in the comments table.
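For readers reproducing the month filter outside SQL, one common way to select a calendar month from timestamps is a half-open range (start of month inclusive, start of next month exclusive). The Python sketch below shows that idea; the function names are hypothetical, and the scripts at the end of this article use YEAR() and MONTH() on the created column instead.

```python
from datetime import datetime

def month_bounds(year, month):
    """Return (start, end) where end is the first instant of the next month."""
    start = datetime(year, month, 1)
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return start, end

def in_month(created, year, month):
    """True if the timestamp falls inside the given calendar month."""
    start, end = month_bounds(year, month)
    return start <= created < end

print(in_month(datetime(2018, 1, 31, 23, 59, 59), 2018, 1))
print(in_month(datetime(2018, 2, 1, 0, 0, 0), 2018, 1))
```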


2 Tools Used

Valentina Studio, a free data management tool, was used to run the SQL queries. The raw data was then verified and analysed and the graphs and charts were produced using Numbers, the Mac spreadsheet tool.

SQL scripts are included at the end of this analysis.


Summary of Findings

Analysis findings have been included in the Summary of Findings at the start of the report.


3 Scripts

The scripts required to run this analysis are included below.

Distributions


SELECT
    x.Posts as [Posts],
    Count(x.Author) as [Number Users],
    sum(x.PendingPayoutValue) AS [PendingPayoutValue],
    sum(x.CuratorPayoutValue) AS [CuratorPayoutValue],
    sum(x.TotalPayoutValue) AS [TotalPayoutValue],
    sum(x.Posts) as [Number Posts]

FROM 

(SELECT
    Comments.author AS [Author],
    Count(Comments.author) AS [Posts],
    sum(CONVERT(REAL,Comments.pending_payout_value)) AS [PendingPayoutValue],
    sum(CONVERT(REAL,Comments.curator_payout_value)) AS [CuratorPayoutValue],
    sum(CONVERT(REAL,Comments.total_payout_value)) AS [TotalPayoutValue]

FROM
    Comments (NOLOCK)
    
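WHERE
    -- November 2017 shown; change the year and month to rerun for December 2017 or January 2018
    YEAR(Comments.created) = 2017 AND 
    MONTH(Comments.created) = 11 AND
    -- depth = 0 keeps top-level posts only (comments are excluded)
    depth = 0   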

    
GROUP BY
    Comments.author ) as x 
    
 GROUP BY
    x.Posts


1000+ Accounts


select
    Comments.author,
    CONVERT(date, Comments.created) AS [CommentDate], 
    Count(distinct Comments.author) AS [DistinctCommentAuthor],
    Count(Comments.author) AS [Posts],
    sum(CONVERT(REAL,Comments.pending_payout_value)) AS [PendingPayoutValue],
    sum(CONVERT(REAL,Comments.curator_payout_value)) AS [CuratorPayoutValue],
    sum(CONVERT(REAL,Comments.total_payout_value)) AS [TotalPayoutValue]

from 

(SELECT
    Comments.author,
    Count(Comments.author) AS [Posts],
    Count(distinct Comments.author) AS [DistinctCommentAuthor],
    count(Comments.parent_author) AS [ParentAuthor],
    count(distinct Comments.parent_author) AS [DistinctParentAuthor],
    sum(CONVERT(REAL,Comments.pending_payout_value)) AS [PendingPayoutValue],
    sum(CONVERT(REAL,Comments.curator_payout_value)) AS [CuratorPayoutValue],
    sum(CONVERT(REAL,Comments.total_payout_value)) AS [TotalPayoutValue]

FROM
    Comments (NOLOCK)
    
WHERE
    YEAR(Comments.created) = 2017 AND 
    MONTH(Comments.created) = 11 and 
    depth = 0 
    
GROUP BY
    Comments.author ) as x 
    inner join
        Comments (NOLOCK) 
            on Comments.author = x.author           

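 where
    -- keep only the accounts with more than 1000 posts in the month selected above
    x.posts > 1000 and
    -- daily history for those accounts from the start of 2017 onwards, top-level posts only
    YEAR(Comments.created) >= 2017 AND
    depth = 0   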
    
 GROUP BY
    Comments.author,
    CONVERT(date, Comments.created)
    
 order by
    
    CONVERT(date, Comments.created),
    Comments.author

That's all for today. Thanks for reading!



Posted on Utopian.io - Rewarding Open Source Contributors


It's great to see that there are fewer and fewer automated posts, and I hope that the decrease in those posts will continue. Steemit has such a great concept, and people just posting for rewards and filling up the feed with useless posts make it hard for those who really have great content to get noticed.
Thank you for the interesting post!

There is some great work being done behind the scenes to clean up the blockchain. It's good to see some success!

Thanks for stopping by and reading!

Great news!

@spaminator and @mack-bot should be proud.

It's a great result. With the overall increasing volume of posts and users and the increasing rewards available I was expecting a big upsurge in automation. So it was a nice surprise to see the reverse!

I find that when the analysis shows something different than what you are expecting, it makes the work more worthwhile. @patrice and co. will no doubt be smiling today. Cheers!

Great. Are there any ways to deal with automated commenters?

It would be interesting to see the numbers and understand the scale of the problem. It certainly feels like there's a huge amount of automated comment spam.

I would hope that the method used to deter automated posts could also be employed to target and flag the worst spam commenters - it may just come down to how much time and funds are available to carry out the work.

I think comment spam is huge although I hadn't realised that the spam posting had gotten so big. I really hope that there is enough time and funds to do that

Thank you for the contribution. It has been approved.

You can contact us on Discord.
[utopian-moderator]

Coins mentioned in post:

Symbol  Name          Price (USD)  24h change
Steem   Steem         $4,17        7,80%
BTC     Bitcoin       $8788,82     9,70%
ETH     Ethereum      $874,09      9,32%
XRP     Ripple        $1,09        13,53%
BCC     Bitcoin Cash  $1294,12     6,34%

(12.02.2018)

For more Analysis

@miniature-tiger, Like your contribution, upvote.

An automated bot response. One of four automated accounts from the same user. I guess I'll have to extend the analysis to comments next time around!

Great work! From 20% to 2% is a big deal! You guys are heroes!!
It's only right to keep things fair. Looking forward to a great future in Steemit. Joy

Hey @miniature-tiger I am @utopian-io. I have just upvoted you!

Achievements

  • Seems like you contribute quite often. AMAZING!

Community-Driven Witness!

I am the first and only Steem Community-Driven Witness. Participate on Discord. Lets GROW TOGETHER!


Up-vote this comment to grow my power and help Open Source contributions like this one. Want to chat? Join me on Discord https://discord.gg/Pc8HG9x

Nice information thanks for sharing keep up the good work.
