After a short break, Steemfilter is back with a new feature requested by international Steemit users: from now on, you can filter fresh posts by language.
For those who haven’t heard of Steemfilter yet, it’s a tool that filters out possibly low-quality posts and helps you find fresh quality posts on Steemit.
There are ten languages added so far:
East Asian languages such as Chinese, Korean, and Japanese are not supported by the Pear/Text_LanguageDetect PHP library I use. I started with Google language detection, which worked perfectly for a multitude of languages, but I was quickly approaching the free-usage limit because the script handles thousands of posts on each load.
I could easily add the 40 other languages available, but unfortunately load time is a limiting factor. Posts are loaded from the blockchain in batches of 100, and ten cycles is the maximum I can afford; beyond that, the load time becomes too long and the website stalls. So posts in a specific language are selected from the first thousand posts older than 30 seconds. Experiments showed that only ten languages are posted frequently enough to yield at least one language-specific post per thousand posts.
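The batching described above can be sketched roughly as follows. This is a minimal Python illustration, not the PHP code behind Steemfilter: the `fetch_batch` callable, the `created` and `lang` post fields, and the constant names are all hypothetical stand-ins, and treating "older than 30 seconds" as a per-post age check is one possible reading.

```python
from datetime import datetime, timedelta, timezone

BATCH_SIZE = 100                  # posts per request, as described in the post
MAX_CYCLES = 10                   # ten cycles, so at most 1000 posts are scanned
MIN_AGE = timedelta(seconds=30)   # posts younger than this are skipped


def select_by_language(fetch_batch, language, now=None):
    """Scan up to MAX_CYCLES batches and keep the posts written in `language`.

    `fetch_batch(offset, limit)` is a hypothetical callable that returns a list
    of dicts with the assumed keys 'created' (timezone-aware datetime) and
    'lang' (language code detected earlier).
    """
    now = now or datetime.now(timezone.utc)
    matches = []
    for cycle in range(MAX_CYCLES):
        batch = fetch_batch(cycle * BATCH_SIZE, BATCH_SIZE)
        if not batch:
            break  # nothing more to load
        for post in batch:
            if now - post["created"] < MIN_AGE:
                continue  # too fresh, skip it
            if post["lang"] == language:
                matches.append(post)
    return matches
```

Capping the loop at ten cycles bounds the load time regardless of how rare the requested language is, which is why a language that appears less than once per thousand posts may return nothing.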
I also switched Steemfilter to the new Steem API and added a Use Cases page.
Steemfilter’s Use Cases and How-tos
- Help onboard new Steemians: select reputation 25-25 and the tag “introduceyourself”.
- Support new authors: select reputation 25-45.
- Curate more efficiently and find new interesting bloggers to (auto)vote for: select reputation 45-60.
- Connect to established members: select reputation 60-75.
- Follow specific topics: select a tag.
- Find fresh posts in your language: select a language.
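The use cases above boil down to combinations of a reputation range, a tag, and a language. A minimal Python sketch of that idea follows; the preset names, the `matches` helper, and the post fields `author_reputation`, `tags`, and `lang` are hypothetical and are not taken from Steemfilter's own code.

```python
# Presets mirroring the reputation ranges and tag listed above.
USE_CASES = {
    "onboard_newcomers": {"reputation": (25, 25), "tag": "introduceyourself"},
    "support_new_authors": {"reputation": (25, 45)},
    "curate": {"reputation": (45, 60)},
    "connect_established": {"reputation": (60, 75)},
}


def matches(post, reputation=None, tag=None, language=None):
    """Return True if `post` satisfies every criterion that was supplied.

    `post` is assumed to be a dict with 'author_reputation' (the displayed
    score), 'tags' (list of strings) and 'lang' (language code).
    """
    if reputation is not None:
        low, high = reputation
        if not low <= post["author_reputation"] <= high:
            return False
    if tag is not None and tag not in post["tags"]:
        return False
    if language is not None and post["lang"] != language:
        return False
    return True
```

For example, `matches(post, **USE_CASES["curate"], language="de")` would keep only German posts by authors with reputation between 45 and 60.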
How it works
- Short post check. Short posts are much easier to produce and may look more profitable, but in my experience a short post very rarely gets a good payout, at least for new Steemit authors.
- Image check. The script filters out posts with no images, for the same reason as above: finding a good licensed image requires effort.
- Plagiarism check. The script filters out posts voted for by @cheetah or @steemcleaners.
- Language detection.
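The checks listed above could be chained along the following lines. This is a hedged Python sketch rather than Steemfilter's actual PHP: the word threshold `MIN_WORDS` is invented because the post does not state the real cutoff, the image pattern is only a rough heuristic, and the post fields `body`, `voters`, and `lang` are assumed names.

```python
import re

# Accounts whose votes flag a post as likely plagiarism, per the list above.
PLAGIARISM_VOTERS = {"cheetah", "steemcleaners"}

# Hypothetical threshold; the actual cutoff is not given in the post.
MIN_WORDS = 300

# Rough heuristic: a Markdown image, an HTML <img> tag, or a bare image URL.
IMAGE_PATTERN = re.compile(
    r"!\[[^\]]*\]\([^)]+\)|<img\s|https?://\S+\.(?:png|jpe?g|gif)",
    re.IGNORECASE,
)


def passes_filters(post, language=None):
    """Apply the short-post, image, plagiarism and language checks in order.

    `post` is assumed to be a dict with 'body' (text), 'voters' (list of
    account names) and, optionally, 'lang' (detected language code).
    """
    body = post["body"]
    if len(body.split()) < MIN_WORDS:
        return False  # short post
    if not IMAGE_PATTERN.search(body):
        return False  # no image found
    if PLAGIARISM_VOTERS & set(post["voters"]):
        return False  # voted for by @cheetah or @steemcleaners
    if language is not None and post.get("lang") != language:
        return False  # wrong language
    return True
```

Running the cheap text checks first means language matching is only attempted for posts that already passed the quality filters.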
The code is written in PHP on top of the WordPress platform. Here's Steemfilter’s announcement post.
Steemfilter is listed on SteemTools: come vote for it and find cool new Steem-related tools!
If the project’s user base continues to grow, I’ll experiment with text analysis tools such as readability tests, topic detection, and English grammar checks.
- My deep gratitude to @mahdiyari for helping out with connecting to the new Steem API.
- Front photo by Michael Coghlan / Flickr