Machine Learning The Blockchain - Wordclouds

in #stats, 7 years ago (edited)

The human mind is an unparalleled pattern recognition machine. Wordclouds are a visualisation technique that lets the human eye pick out patterns with a little help from statistics. For this first post, the human mind will be the machine.
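The "little help from statistics" behind a wordcloud is usually just a word-frequency count, with each word drawn larger the more often it appears. That is my general understanding of the technique, not a description of how the clouds in this post were produced. A minimal Python sketch of the counting step follows; the sample titles and the short stopword list are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; real wordcloud tools ship much longer ones.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "my", "on", "for", "i"}

def word_frequencies(titles, top_n=5):
    """Count how often each non-stopword appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep only runs of letters/apostrophes.
        words = re.findall(r"[a-z']+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Made-up titles for illustration, not real blockchain data.
sample_titles = [
    "My first Steemit post",
    "Why I love Steemit",
    "Bitcoin and Steemit: the future of blogging",
    "I love bitcoin",
]

top_words = word_frequencies(sample_titles)
print(top_words)
```

A drawing library would then map each count to a font size; the counting is the only statistical part.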

What can you pick out of the clouds?

Post Titles in July

Blockchain

When I look at Steemit, I see a rich public data source: an immutable public blockchain that is waiting to be explored.

Data in the 21st Century

All facets of life are present here: cultures, languages, religions, occupations. One of the greatest aspects of Steemit is that it is built on a public blockchain. The data is immutable and available for anyone to view; it cannot be changed, censored or hidden. As Steemit grows, it can become difficult to keep track of the authors you really like. We get flooded with data. I am now following over 900 people, so my feed is usually full of posts, too many for me to reasonably read and respond to in a day.

Post Body Text in July

Machine Learning

The machines are rapidly catching up with the human brain and in this series of posts I will be using advanced machine learning techniques to find patterns in the Steemit Blockchain, establishing relationships and connections. It will also help me to find posts and authors to follow.

Thanks to @furion and steemdata.com, the blockchain data is publicly accessible. Using some cutting-edge machine learning techniques, I will explore the blockchain over the coming weeks, looking for interesting patterns in the data.

Do you want to know

  • Who is the most positive author?
  • What combination of tags correlates with the highest payouts?
  • What is the best time of day to post, for comments, payout rewards, votes?
  • What topics draw the most comments, payouts and votes?
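As a rough idea of how the "best time of day" question could be approached, here is a minimal sketch that groups posts by the hour they were created and averages the payout per hour. The record layout (`created`, `payout`) and the sample values are hypothetical, chosen for illustration; they are not the actual steemdata.com schema.

```python
from collections import defaultdict
from datetime import datetime

def mean_payout_by_hour(posts):
    """Group posts by the hour of creation and average the payout per hour."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for post in posts:
        # Assumes ISO-style timestamps without a timezone suffix.
        hour = datetime.strptime(post["created"], "%Y-%m-%dT%H:%M:%S").hour
        totals[hour] += post["payout"]
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}

# Made-up records for illustration, not real blockchain data.
sample_posts = [
    {"created": "2017-07-01T09:15:00", "payout": 4.0},
    {"created": "2017-07-02T09:45:00", "payout": 6.0},
    {"created": "2017-07-01T18:30:00", "payout": 1.5},
]

averages = mean_payout_by_hour(sample_posts)
best_hour = max(averages, key=averages.get)
print(averages, best_hour)
```

The same grouping works for comments or votes by swapping the summed field. With real data, an average over only a handful of posts in an hour would be noisy, so a minimum post count per hour would be a sensible addition.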

Machine Learning Challenges

Anyone interested in data science will have heard of https://www.kaggle.com/.

Wouldn't it be interesting to submit Steemit Blockchain data to this site for a competition with a bounty?

If anyone has a good idea for a problem and would like to support a bounty, I would love to hear your suggestions.


Thank you for reading. I write on Steemit about Blockchain, Cryptocurrency, Travel and lots of random topics.


Kaggle looks very cool. Will join very soon. Presently I am researching data science.

An interesting topic with a bounty on it, would get great interest from that community and would be a good way to attract people to Steemit.

steemit, love, bitcoin,

We sure do love talking about steemit.

No doubt!

Awesome post! We see wordclouds all the time, but seldom hear about the technology behind these awesome visualization techniques. I've also dabbled a bit with wordclouds -- if anyone wants to learn more about them, check out my post here. I hope I see more data visualization posts going around!

Good post @eroche
Follow me @Julfan

One thing that pops out from the title word cloud is that positive words vastly outnumber negative words.

My first proper machine learning exercise is going to be sentiment analysis. I am going to try to find all the most positive authors.
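One simple way such an exercise could start is lexicon-based scoring: count positive and negative words per post, then average the scores per author. The sketch below is only that, a sketch under stated assumptions; the tiny word lists, author names and sample posts are hypothetical, and a real analysis would use a proper sentiment lexicon or a trained model.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"love", "great", "awesome", "good", "happy"}
NEGATIVE = {"hate", "bad", "awful", "sad", "scam"}

def sentiment_score(text):
    """Return (positive - negative word count) divided by total words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

def rank_authors(posts):
    """Average the score per author and sort from most to least positive."""
    scores = defaultdict(list)
    for author, text in posts:
        scores[author].append(sentiment_score(text))
    means = {author: sum(s) / len(s) for author, s in scores.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

# Made-up (author, text) pairs, not real blockchain data.
sample = [
    ("alice", "I love this awesome community"),
    ("bob", "This is a bad and sad day"),
    ("alice", "Great post"),
]

ranking = rank_authors(sample)
print(ranking)
```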

Kaggle rocks. So do data science and machine learning. What did you use for this? Did you use the Power BI word cloud visualization?

I haven't yet, I will give it a try on this data and see what kind of results I get.

I wonder, can it handle non-Latin text?

I don't think it can handle non-Latin text.

You can run sentiment analysis too using Power BI and Microsoft Cognitive Services... I have a post on my website on how to do it.
