Coding Illy, a post-discovering Python script that finds posts randomly :)

in #programming, 7 years ago

Steemit is a wonderful place - you cannot deny it. The amount of content and the possibilities here are just awesome. Of course, you can fight back with powerful points like 1. the mobile site sucks 2. there are just too many scams and spammers around here 3. you only have 100% voting power and it drains a little too fast 4. I'll leave the rest for you to fill in. But it's important to have fun and look at the bright side of things. What makes Steemit fun is the amount of content, talent, beautiful stories and awesome creations that people from across the globe spread across the blockchain. You can never expect less from the world - there are too many hidden gems around.

Every post here has a voting period of 7 days before the reward is officially paid into the author's account. Sounds fair, since people need time to find posts they like and give their organic votes. Well I know, bot voting is the way to go, but heck, that removes the fun element and just upvotes too much trash. Bots aside, there is also one major problem with Steemit - new posts spawn too quickly and a post can hardly get any attention without direct link sharing. Most of the time, if you don't share a post, don't have a base of organic readers and don't join any communities, your post won't get any attention at all unless it gets some within the first hour. Think about it, we have around 1 million accounts here, and the number of posts created every single second is just insane. There's a reason Steem creates one new block in the blockchain every 3 seconds - to handle the new transactions, posts, comments, votes and flags that are happening every single second in every corner of the world.

If a post doesn't get any attention in the first hour, will it get any attention afterwards? I guess not. Unless you are super lucky and land on the radar of some huge curators like @curie and @yoo1900. Even I sometimes get bored looking at all the effortless posts on the Created page and don't feel like scrolling further. It just doesn't feel fun looking through tons of trash trying to find a diamond in it.

That's the reason I created Illy.


Illy is a post-discovering script, written in Python. Remember that Steemit is based on Steem and the latter is a blockchain? This is how Illy works. She (or it?) does not search from the beginning of all posts or something. Illy does something slightly more efficient to randomly find posts that are still upvotable. That being said, it is absolutely not something that can help you maximize curation rewards (in fact, Illy will return stuff that gives you close to zero curation rewards for upvoting it, since the posts will be pretty old, up to 6 days). Illy is just a naive script, created by a naive guy to find interesting content across Steemit :)

Oh, you ask why the name "Illy"? It's a tribute to my first ever coding project (Illyphia the Discord bot), and I just like reusing names :P

  1. Asks you for the desired post length, tag, and how many posts you would want to have, if you have any requirements on these. Normally asking for long posts with the tag "introduceyourself" returns fairly fun results with great self-introductions and potential future writers. If you are looking for something to flag, using the tag "fiction" with post length requirement as "long" might give you a ton of plagiarism. It's totally luck dependent anyways.

  2. Depending on the tag you have given, Illy does one of two things: with no tag, or with a tag that is among the top 100 popular tags on Steemit, go to 3; with a tag that is not popular enough, go to 5.

  3. Illy will get the head block number of the Steem blockchain using the Steemjs API, and select a random block between this one and approx. 170k blocks before it (a bit under 6 days of blocks, at one block every 3 seconds). If you don't know yet, all comments, votes and posts are available as transactions in a Steem block. If there are such transactions in the selected block, Illy will randomly pick one of the voted posts, voted comments or submitted comments as the starting post to search from. If the selected post is a comment, then Illy will find its parent post and use that as the starting point. If the block does not have any transactions containing post-related data (it happens), then Illy will just pick another block.

  4. From that post, Illy will start to dig up further posts in a fashion similar to how we scroll the Created page. The difference is that Illy starts digging from a different point, so she discovers old posts pretty efficiently, while we might be scrolling for hours to get there. If you didn't specify a tag, Illy will return any posts that fit the length criteria you stated. Otherwise, Illy will look for a post that has the tag, use it as a starting point to search within that specific tag, and return posts according to your criteria.

  5. If the tag is not a common tag, Illy will directly ask Steemit for the top 100 trending posts of the stated tag. If no posts are returned, it means you are naughty and asked for a nonexistent tag :) and Illy will just tell you that she can't find anything. Otherwise, she will randomly pick one of the posts on the trending page of that tag and start the search from there. Similar to how Illy treats search queries without tags, she goes through posts the way we scroll the Created page, just with a different starting point. Once she has collected enough posts, take your results (ノ´ヮ`)ノ*: ・゚
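To make step 3 a bit more concrete, here is a minimal sketch of the random block pick. It is not Illy's actual code. All function names are made up for illustration, and it assumes a block arrives as a dict whose transactions each carry an `operations` list of `[name, payload]` pairs. Fetching the head block number and the block itself from the API is left out, and so is resolving a comment to its parent post.

```python
import random

BLOCK_INTERVAL_SECONDS = 3   # Steem produces one block every 3 seconds
LOOKBACK_BLOCKS = 170_000    # the "approx 170k blocks" window from step 3


def lookback_days(blocks=LOOKBACK_BLOCKS):
    """Convert a block count into days of chain history."""
    return blocks * BLOCK_INTERVAL_SECONDS / 86_400


def pick_random_block(head_block, rng=random):
    """Pick a block number between head - LOOKBACK_BLOCKS and head, inclusive."""
    lowest = max(1, head_block - LOOKBACK_BLOCKS)
    return rng.randint(lowest, head_block)


def post_refs_in_block(block):
    """Collect (author, permlink) pairs from vote and comment operations.

    Assumed layout (hypothetical sample, not verified against Illy):
    {"transactions": [{"operations": [["vote", {...}], ["comment", {...}]]}]}
    """
    refs = []
    for tx in block.get("transactions", []):
        for op_name, payload in tx.get("operations", []):
            if op_name in ("vote", "comment"):
                refs.append((payload["author"], payload["permlink"]))
    return refs


def pick_starting_ref(block, rng=random):
    """Return one random (author, permlink) pair, or None for an empty block."""
    refs = post_refs_in_block(block)
    return rng.choice(refs) if refs else None
```

With these numbers, `lookback_days()` comes out a little under 6 days, which fits inside the 7-day voting period. When `pick_starting_ref` returns `None`, the caller would simply pick another block, as described in step 3.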

If at any point Illy sees posts that are too old to be upvoted (the author has already been paid), she restarts the process from Step 2, retrying up to 3 times before totally giving up and returning what she has found so far. This means that tags which count as popular but are rarely used can cause some trouble for Illy: it might take a long time and download a ton of data before she finds the stuff you are looking for.
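The retry behaviour could look roughly like the sketch below. Again, this is only an illustration under assumptions, not the real thing: `search_once` is a hypothetical callable standing in for one pass through steps 2 to 5, and the timestamp format (`2018-07-01T12:00:00`, UTC) is assumed.

```python
from datetime import datetime, timedelta

PAYOUT_WINDOW = timedelta(days=7)  # voting period mentioned earlier in the post


def is_within_payout_window(created_iso, now):
    """True if a post created at `created_iso` has not reached payout yet.

    Assumes timestamps look like "2018-07-01T12:00:00" (UTC, no offset).
    """
    created = datetime.strptime(created_iso, "%Y-%m-%dT%H:%M:%S")
    return now - created < PAYOUT_WINDOW


def search_with_retries(search_once, wanted, max_retries=3):
    """Run search passes until `wanted` posts are collected or retries run out.

    search_once() is a hypothetical stand-in for one pass through the steps
    above. It returns the posts found in that pass and is expected to stop
    early once it hits a post that is already paid out.
    """
    found = []
    for _attempt in range(1 + max_retries):  # first try plus the retries
        for post in search_once():
            if post not in found:  # skip duplicates across passes
                found.append(post)
            if len(found) >= wanted:
                return found
    return found  # give up and return whatever was collected
```

Capping the number of passes is what keeps a rarely used tag from looping forever, at the cost of sometimes returning fewer posts than requested.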

Here's a random vid of bug testing I just did, apparently the code logic works pretty well :)

Of course, I won't consider it completed, it is far from being a finished product. I wish Illy could detect if Cheetah marked the post, detect the post's pending payout so minnows' posts can be found more efficiently (in the vid, the post I chose to open had quite a huge payout while the rest had quite small payouts), search for posts with certain terms specified by the user, have a pretty GUI that allows me to vote and comment from it, etc. There is so much stuff that can be done with this piece of code. Guess I'll have a ton of fun playing with it in the future :P



Well...sometimes to-do lists just make me lazy.

I might release it as open source software once I can make sure that it is safe to use :3 Previously the logic went wrong and it downloaded >120MB of posts before I realized that I had to stop it before my mobile data ran out. It still has weird behaviours (like downloading lists of posts a few extra times before returning results) and sometimes crashes. Anyway, it has been a very good learning experience trying to code a project in a semester break and I guess I learnt quite a lot from it. The biggest lesson might be "don't test code involving internet connections on mobile data" but other lessons are important too, like how setting up logging helps to identify problems quickly compared to white box testing (aka checking code logic one line at a time, the human brain sometimes refuses to work without caffeine), etc. Most importantly, it is real fun building it from scratch and enjoying the wonders this little script found for me.
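For anyone curious what "setting up logging" plus a guard against runaway downloads might look like, here is a small sketch using only the standard library. The `DownloadBudget` class and its names are hypothetical and not part of Illy. The idea is to log every download and stop once a byte limit is crossed.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("illy")


class DownloadBudgetExceeded(RuntimeError):
    """Raised when more data was downloaded than the budget allows."""


class DownloadBudget:
    """Count downloaded bytes and stop the search once a limit is crossed."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def charge(self, payload, label="response"):
        """Record one downloaded payload (bytes) and enforce the limit."""
        self.used_bytes += len(payload)
        log.info("%s: %d bytes, total %d/%d",
                 label, len(payload), self.used_bytes, self.limit_bytes)
        if self.used_bytes > self.limit_bytes:
            log.error("download budget exceeded, stopping")
            raise DownloadBudgetExceeded(
                f"used {self.used_bytes} of {self.limit_bytes} bytes"
            )
```

Calling something like `budget.charge(raw_response, "post list")` after each request would leave a trail in the log showing exactly which step keeps pulling extra data.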

Gotta go and find a few more interesting posts with this one :D

See you next time.

P.S.: If you want to get your hands dirty playing with this piece of code, you may reach me anytime in the Steemit-Friends Discord server, or shoot me a DM directly @Lilacse#0020.


Header image: It's a photo captured by myself, feel free to reuse.
Wallpaper in video: View on pixiv
Icon theme in video: paper-icon-theme on GitHub


I have a suggestion: you can add in steemdb or steemsql to check for the Cheetah vote or the pending payout.

I don't have much time to read through now, but I'll definitely read it later. Drop me the link to the GitHub also haha

Actually the Steemjs API has everything, I can just use that :P I implemented the pending payout thing once, but deleted it because I mistook it for the cause of the slow searches, while the actual reason was my searching algo LOL.

Gotta rebuild the logic and perhaps accidentally make a better one :) who knows, the brain is a strange thing lul.


Computer genius

You flatter me too much OTL
