The Internet of Filters

in #steemit · 7 years ago (edited)

sieve-2202240_1280.jpg

What is wrong with Facebook's news feed, Google search results and Amazon's suggested products?

Is it that they show advertisements? In part, since the advertisements are chosen and even customized using your personal data, collected over a long period of time. But what else?

How about the siloing of information? That's not great for humanity as a whole, but great for their shareholders. In order to access your information you have to use their services, and go through their advertised portals. In many cases they own this information or a license to use it forever, which is in many respects effectively the same thing. Anything else?

Well yes, plenty, I could go on. But what I've been thinking about lately, as it is still topical, is the so-called "filter bubble" or "echo chamber" effect. That is, undisclosed algorithms control what you see, are extremely personalized, and are manipulating you.

I don't say that lightly. The manipulation is ostensibly for the benefit of advertisers, but we've been living with that devil's deal for decades. It has become much more serious though. It's now common knowledge that Facebook uses its users as lab rats for socially positive experiments, such as increasing voter turnout and organ donation. Google similarly uses private censorship (which I distinguish from the government variety) and has often been suspected of working with the US government to subtly skew search results (though to the best of my knowledge there has never been proof).

It's so bad and so well known that a new social network, Vero True Social, says in its launch video that because they don't advertise, they "didn't have to put any algorithms in it" (their app). For anyone who knows anything about computer science, that's obviously very funny phrasing, but it shows how much of a dirty word "algorithm" has become, representing not just any procedure but those hidden, manipulative and wealth-extracting bits of code that misguided Silicon Valley nerds write and that ultimately end up defining our lives, both online and off.


Of course, there are algorithms in any computer program. But there is a growing market for algorithms which work for us and not for someone else at our expense. No-one wants to go back to completely impersonal search results, or an activity feed of literally everything that is happening. So how can this happen?

I believe there is no way these giants are going to reform their practices. They are doing too well at it, and even if (when!) the tide turns against them they will be unable to innovate toward a completely different paradigm without a complete restructure. This is why I'm on Steemit: we have the opportunity here to try a completely new way.

What we're missing is filters. I suspect this is due to priorities on the developer side of things. But it's really quite a large gap. The good news is that because there's very little filtering already we are in a good position to make our wishes known to the developers and see our wishes realized.

Water_filter_contents.jpg

What we already have

There are 13 ways to get posts as a stream, as offered by the Steem API, ordered chronologically. These are essentially the most basic filterings of the entire database of unique posts (as opposed to the transactions which make up the blockchain).

These are sprinkled around the steemit.com interface, such as your feed, the 4 items on the top bar (see image), when you view your blog or someone else's, etc.

steemit_topbar.png

Here are the related function declarations from database_api.hpp, which I'll just leave here without further explanation:

vector<pair<string,uint32_t>> get_tags_used_by_author( const string& author )const;
vector<discussion> get_discussions_by_payout( const discussion_query& query )const;
vector<discussion> get_post_discussions_by_payout( const discussion_query& query )const;
vector<discussion> get_comment_discussions_by_payout( const discussion_query& query )const;
vector<discussion> get_discussions_by_trending( const discussion_query& query )const;
vector<discussion> get_discussions_by_created( const discussion_query& query )const;
vector<discussion> get_discussions_by_active( const discussion_query& query )const;
vector<discussion> get_discussions_by_cashout( const discussion_query& query )const;
vector<discussion> get_discussions_by_votes( const discussion_query& query )const;
vector<discussion> get_discussions_by_children( const discussion_query& query )const;
vector<discussion> get_discussions_by_hot( const discussion_query& query )const;
vector<discussion> get_discussions_by_feed( const discussion_query& query )const;
vector<discussion> get_discussions_by_blog( const discussion_query& query )const;
vector<discussion> get_discussions_by_comments( const discussion_query& query )const;
vector<discussion> get_discussions_by_promoted( const discussion_query& query )const;
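
One way to read these declarations is that each call returns the same pool of posts under a different ordering, cut off at a limit. The sketch below illustrates only that idea and is not how the Steem node implements it. `Post`, `top_by`, `by_created` and `by_votes` are made-up names, and the real `discussion` and `discussion_query` types are not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, stripped-down stand-in for a post (not the real `discussion` type).
struct Post {
    std::string author;
    std::uint64_t created;   // creation time, assumed to be seconds since epoch
    std::int32_t net_votes;  // vote count, assumed to be a plain integer
};

// Sort a copy of the posts with the given ordering and keep at most `limit` of them.
template <typename Compare>
std::vector<Post> top_by(std::vector<Post> posts, std::size_t limit, Compare comes_first) {
    std::stable_sort(posts.begin(), posts.end(), comes_first);
    if (posts.size() > limit) posts.resize(limit);
    return posts;
}

// Newest posts first.
std::vector<Post> by_created(const std::vector<Post>& posts, std::size_t limit) {
    return top_by(posts, limit,
                  [](const Post& a, const Post& b) { return a.created > b.created; });
}

// Most-voted posts first.
std::vector<Post> by_votes(const std::vector<Post>& posts, std::size_t limit) {
    return top_by(posts, limit,
                  [](const Post& a, const Post& b) { return a.net_votes > b.net_votes; });
}
```

Seen this way, a new stream is just a new comparison function, which is why these count as the most basic kind of filtering: nothing is personalized, only reordered and truncated.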

Cement_Biosand_Filter_Drawing.png

Where we need to go from here

The bot I wrote, FOSSbot Voter, has at its core a novel score-based filtering system that allows for custom algorithms. I've talked about it at length before, but in case you didn't catch that: you weight various metrics (numerical representations of facts) so that a post is analyzed and given a score of how likely you are to be interested in it, based on your preferences.
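
To make the weighting idea concrete, here is a minimal sketch of a weighted-sum score with a cut-off. It is not FOSSbot Voter's actual code. The metric names, the weights and the threshold are all invented for illustration, and a real scoring system may combine metrics differently.

```cpp
#include <map>
#include <string>

// Metric name -> numeric value measured for one post (names are made up).
using Metrics = std::map<std::string, double>;
// Metric name -> how much the user cares; a negative weight counts against a post.
using Weights = std::map<std::string, double>;

// Weighted sum over the metrics the user has given a weight to.
// A weighted metric that the post does not have contributes nothing.
double score(const Metrics& post, const Weights& weights) {
    double total = 0.0;
    for (const auto& entry : weights) {
        const auto found = post.find(entry.first);
        if (found != post.end()) total += entry.second * found->second;
    }
    return total;
}

// A post is shown only if its score reaches the threshold the user chose.
bool passes(const Metrics& post, const Weights& weights, double threshold) {
    return score(post, weights) >= threshold;
}
```

For example, with weights of 1.5 for `image_count`, 0.5 for `votes` and -0.25 for `spelling_errors`, a post with 2 images, 10 votes and 4 spelling errors scores 3 + 5 - 1 = 7, so it passes a threshold of 5 but not one of 8.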

Ideally we would get something a bit more user friendly! I got a lot of usability questions about it, and to be honest it's a programmer's tool. It's not easy to use without some computer science or at least math background.

However I'm not attached to this approach. What I am attached to are these simple principles:

  1. Transparency - you know how posts are being chosen, and why some are left out
  2. Customization - you own the algorithm, or your version of it, and can change it or turn it off at will
  3. Access to expertise - I offered this by being friendly with those asking questions, but this is a huge gap in most systems, one that is hidden by their opaque nature
  4. Ownership - delete it, copy it, sell it - the algorithm, or as much of it as you have made, is yours
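
As a rough sketch of what the first two principles could look like in code, the filter below reports which rule hid a post and lets the user switch any rule off. Everything here is hypothetical: `Post`, `Rule`, `Decision` and `decide` are illustrative names, and the two example rules in the note after the code are just sample preferences.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical facts about a post that a rule might look at.
struct Post {
    std::string language;  // assumed to be a language code such as "en"
    int word_count;
};

// One user-owned rule: a readable name, an on/off switch and the test itself.
struct Rule {
    std::string name;  // reported to the user when this rule hides a post
    bool enabled;      // customization: the user can turn any rule off
    std::function<bool(const Post&)> accepts;
};

// The outcome for one post, including why it was hidden (transparency).
struct Decision {
    bool shown;
    std::string reason;  // empty when shown, otherwise the name of the rejecting rule
};

// Apply the enabled rules in order; the first one that rejects the post is the reason.
Decision decide(const Post& post, const std::vector<Rule>& rules) {
    for (const Rule& rule : rules) {
        if (rule.enabled && !rule.accepts(post)) return {false, rule.name};
    }
    return {true, ""};
}
```

With an enabled rule named "at least 100 words", a 20-word post comes back as hidden with exactly that name as the reason. Setting the rule's `enabled` flag to false makes the same post visible again.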

Risk of the Bubble

Won't increased filtering just bring the dreaded Facebook Filter Bubble here to wonderful Steemit? No, it doesn't need to, because we can use it in conjunction with our feed, what's new and what's trending. Just add it as another of these tabs.

In fact, what I've found from people who used my admittedly weird algorithm is that they discovered more stuff. The Steemit Feed is far worse for creating a bubble. I myself have often gotten trapped in just what the people I'm following are talking about. When I venture into New it takes a long time, but I find other stuff and wish I could read more of it, but there isn't the time.

That's another thing this is about - respecting the limited time of users. As anyone in @steemcleaners or @curie knows, reading the New stream is a full-time job. It doesn't need to be for regular users.

We can't trust in the following / follower network (friend of a friend of a friend) to get to know new people. We need to discover, and that means jumping across the network horizontally.


Note: no image references, all images are CC


Yeah, the extra tab approach is the right one, for sure. A "random post" button would also be dope, like StumbleUpon, remember that?

I sure do, I used it many moons ago. That would be cool, but even better than fully random would be if it took into account some facts about the post to weed out spam and that kind of thing.

Yeah absolutely, I meant within your personalised algorithm or something.

I find your article spot on, on so many levels.
I so agree that we are being compartmentalized. When we had regular antenna TV the whole country watched any of 3 channels. We were on the same page. Now, any of us can get so blind-sided by living in our own little universes. You are right, that this has got to change.

As far as filtering goes, lately I've been lamenting the lack of it. For me, what I want to filter out is foreign language posts. Maybe they have great content, but I can't comment on them, can't improve myself from them, and they are of no help to me. Also, I'd like to filter out content with grammar or spelling errors, but the filter should be smart enough not to filter out science fiction stories with fictitious names and places, or examples of computer code, etc. These kinds of filters should be the easiest ones to write. Add to that filtering out posts with only one photo and no story, etc. You get it.
I'd love to write that one myself. I am a software developer. But I don't have the bandwidth for that learning curve.

Just my two cents, literally, I upvoted you and that's what my vote is worth.

Thanks. Glad you agree there's a need for filtering to change broadly, and that we can be pioneers here 😊

I think it's very reasonable to want to filter out languages you don't understand. I hadn't thought of using it as an example, but it's probably the most basic and least controversial example there is, good thinking. Stepping up to grammar and spelling errors, that's a bit more difficult, but totally doable. I was experimenting before with some modules which gave a reading comprehension score and a spelling check, and they could be applied to that.

One thing I didn't mention, which is something that the devs would surely think about, is all the processing power that would be required to add these filters. Steemit.com looks like it has been struggling with basic services these last few days, when I've been on it a little more, which is disappointing. But it would take a huge amount of extra processing power to run these filters.

Perhaps there could be a way to filter client side, on the user's browser? That way it's up to us if we want to "invest" our computing power in the processing overhead for this feature, and leaves the server more or less only at their current processing load.

That's quite interesting, that these filters would gobble up processing power. I'd have thought it to be a no-brainer, especially for simple filters like the ones I've described. I guess I didn't think that through.

There are alternate paths to the steem blockchain. I'm not familiar with most of them, but I know they are out there. Along that line, there is another thought I'd like to throw out, that is to have decentralized servers accessing the steem blockchain, developed by the community for either love of the community or for financial profit. Doing so should leave the guys at Steemit.com to do the development of their roadmap. Anyone else can develop their own web interface and see which the market likes best.

I've been thinking about these things independently over the last few months, it's that pesky learning curve and my personal bandwidth thing stopping me.

[...] there is another thought I'd like to throw out, that is to have decentralized servers accessing the steem blockchain, developed by the community for either love of the community or for financial profit. Doing so should leave the guys at Steemit.com to do the development of their roadmap. Anyone else can develop their own web interface and see which the market likes best.

Anyone can write a server for the Steem blockchain, and there are two main alternates that I know of: busy.org and chainBB. Both are really good but they are not decentralized. However the fact that there are three main ways to access the blockchain using a web app of some kind is a kind of decentralization.

And there's nothing stopping anyone else from making a new one, except time and resources. 🙂

A lot has changed between the internet of before and the internet of today. There are a lot of sponsored services that have taken over the real thing. Everything now is about money; morals don't matter.
Nice post

This is very good post. I have still more to learn about Steemit myself :)

I like your post. @personz I have followed you

Thank you. As a tip, you should improve your posts to not just post YouTube videos you obviously don't own. That's why your rep is at 7.

I suggest you stop commenting like this. People flag these.
