Blockchain Bachelor’s Thesis – Information Overload and Methods of its Elimination in the Modern Information Society: Going Through the Sources pt. 6

in #education · 6 years ago


Previously published


Introduction

Blockchainized Bachelor’s Thesis
Blockchainized Bachelor’s Thesis – Initial Brainstorm

Thesis

  1. Preface

Sources

1. Battling Information Overload in the Information Age
2.1. The knowledge-attention-gap: Do we underestimate the problem of information overload in knowledge management? pt. 1
2.2. The knowledge-attention-gap: Do we underestimate the problem of information overload in knowledge management? pt. 2
3. Database Research faces the Information Explosion
4. The experience of mobile information overload: struggling between needs and constraints
5. Longer online reviews are not necessarily better
6. An ant-colony based approach for real-time implicit collaborative information seeking

Case study: Interview

  1. First draft

Article


Malizia, A., Olsen, K., Turchi, T., & Crescenzi, P. (2017). 'An ant-colony based approach for real-time implicit collaborative information seeking', Information Processing & Management, 53(3), pp. 608–623. Library & Information Science Source, EBSCOhost.

An ant-colony based approach for real-time implicit collaborative information seeking


„Different studies pointed out users’ low degree of satisfaction with search engines. Fox, Karnawat, Mydland, Dumais, and White (2005) devised a machine learning approach that employs users’ actions (for example the time spent on a page, scrolling usage, and page visits) and concluded that users consider 28% of search sessions unsatisfactory and 30% only partially satisfactory. Xu and Mease (2009) measured the average duration of a search session and found that users typically quit a session — even without having satisfied their informational need — after three minutes.“

The whole topic of “satisfaction with search engines” is very interesting and, from my point of view, relevant to information overload itself. The major problem I have with the study presented in this article (a problem that is generally present in information science) is that all the outcomes were measured only technically. In other words, those running the study relied solely on data gathered by algorithms and technical statistics. I agree that such practices can yield interesting data; on the other hand, we cannot say with 100% certainty that an information need was satisfied just because a user searched for information, opened one of the articles found, and stayed there approximately long enough to read through it. The information need could also have been satisfied without accessing any article found by the search engine at all. Still, even though we cannot take the numbers for granted, the “real numbers” would probably not be too different.


„The ScentTrails system (Olston & Chi, 2003), which continuously allows users to supply keywords and enriches hyperlinks to provide a path that achieves the goal described by them. Some techniques aiming at improving their performance have been summarized (for example, Learning to Rank) to highlight a few key concepts: (1) the relationship between users seeking information and the optimal foraging theory; (2) the need for a search engine to adapt itself to users’ behavior; and (3) the need to perform such adaptation in real time.”

The ScentTrails system is one of the two foundations of this study; I will describe it shortly. The core of this quote, though, is that in order to make the “searching experience” friendlier, all three key concepts need to be met.


“Almost none of the aforementioned approaches take into account all three of these aspects, as stated by Wu and Aberer (2003) and Olston and Chi (2003). Beyond a doubt, a Swarm-based approach can take into account all three key factors and is, nonetheless, a much more elegant and simple method than all of the other “ad-hoc” alternatives.“

The Swarm-based approach is the second foundation, which together with ScentTrails creates the optimal searching approach according to the study. Now the explanation finally comes…


„Each day ants leave the colony in search of food and building materials; they will exploit the surroundings in all directions in a somewhat random fashion. If an ant finds anything of interest, it will return to the colony depositing pheromone, a chemical substance that the other ants are able to detect. Thus they create trails to signal the path between the colony and the food. The quantity of pheromone deposited, which may depend on the quantity and quality of the food, will guide other ants to the food source. That is, the other ants in the colony may now use the pheromone as a trail marker to reach the food. This marker evaporates over time, so that uninteresting trails disappear. Shorter trails will get a higher level of pheromone, thus shorter trails will endure longer, providing a notion of optimization. Without a doubt, humans are more intelligent and organized than ants.“

This is where ScentTrails and the Swarm approach are finally connected. Simply put, any action one takes while searching for information via a search engine leaves a “digital pheromone” on the path to the destination. Since every single user searching for information has this ability, a swarm effect is created: every subsequent user looking for the same information can use the “digital pheromones” left in the digital space to reach the desired information more quickly than those who searched for it before.
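The pheromone mechanics quoted above (deposit, evaporation, and the advantage of shorter trails) can be sketched in a few lines. This is an illustrative toy model, not the paper's implementation; all constants are my own assumptions.

```python
# Toy model of ant-colony pheromone dynamics: shorter trails are
# reinforced more often per unit time, while evaporation gradually
# removes uninteresting trails. All numbers are hypothetical.

EVAPORATION = 0.1   # fraction of pheromone lost each time step
DEPOSIT = 1.0       # pheromone an ant deposits on completing a round trip

def simulate(trail_lengths, steps=100):
    """One ant per trail; a round trip takes `length` steps, so shorter
    trails receive deposits more frequently."""
    pheromone = {length: 0.0 for length in trail_lengths}
    for t in range(1, steps + 1):
        for length in trail_lengths:
            pheromone[length] *= (1.0 - EVAPORATION)   # evaporation
            if t % length == 0:                        # round trip completed
                pheromone[length] += DEPOSIT
    return pheromone

levels = simulate([2, 5, 10])
# The shortest trail ends up carrying the most pheromone.
assert levels[2] > levels[5] > levels[10]
```

Running it confirms the optimization notion from the quote: the shortest trail accumulates the highest pheromone level, so it "endures longer" than the others.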


„Let us assume that a set of users all start with the same query, for example “compact camera GPS”. That is, they are all interested in finding Web sites that can offer a good bargain for such a camera (“food”). Our group may start with a Google query, and click on links to explore the results. These click streams will define our “pheromone” or virtual trail. They may for example be implemented by adding score values to each link, or visualized by representing the links by large fonts, stronger color, etc.“

This quote further explains the combined approaches, in case you did not get it just yet.
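The quote mentions implementing the virtual trail "by adding score values to each link". A minimal sketch of that idea, with made-up queries and URLs, could look like this:

```python
# Hypothetical sketch of the "digital pheromone" as score values
# attached to links: each click in a search session for a given query
# deposits pheromone on the clicked URL, and later users issuing the
# same query see the links ranked by accumulated score.

from collections import defaultdict

pheromone = defaultdict(lambda: defaultdict(float))  # query -> url -> score

def record_click(query, url, deposit=1.0):
    """Deposit pheromone on a link a user clicked for this query."""
    pheromone[query][url] += deposit

def ranked_results(query):
    """Return URLs for this query, ordered by accumulated pheromone."""
    return sorted(pheromone[query], key=pheromone[query].get, reverse=True)

# Three users searching "compact camera GPS" click the shops they found useful.
record_click("compact camera GPS", "shop-a.example")
record_click("compact camera GPS", "shop-b.example")
record_click("compact camera GPS", "shop-a.example")

assert ranked_results("compact camera GPS")[0] == "shop-a.example"
```

The visualization the quote mentions (larger fonts, stronger colors) would simply map these scores onto link styling.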


„It’s pretty intuitive to find a parallelism between the way ants forage for food and the way users employ search engines to satisfy their informational needs; yet the latter, unlike ants, don’t leave any trace at all, so they can’t provide any clues to the next users with their same informational needs, and — since about 30–40% of queries issued to a search engine are already been submitted (Xie & O’Hallaron, 2002) — that’s a pretty common scenario.“

I never thought about it, but it makes perfect sense that around a third of all submitted searches have already been submitted before. There is no logical reason not to use the “digital pheromones”, apart from the system being hard to code and to use intuitively.


„On the Web though, users cannot provide a solution to the ranking problem but can assist by providing their own view of relevance. In fact, the first document that a user selects among the results in a given search session is the one that, based on the available clues, is perceived as the most relevant to the user (Church, Keane, & Smyth, 2004). The most relevant document according to a user is the one that should be in a highly relevant position in the optimal solution to the ranking problem for that given query. The next document, selected in the same session, is considered less relevant since it was selected after the previous document.
The SessionRank algorithm employs the relative order of clicks performed by each user during a session and increments the pheromone’s quantity accordingly. Therefore, choosing an exponential decay.“

Google actually has some kind of swarm-based scent trail system, but it is highly ineffective. Don’t get me wrong, it is better than nothing, but here lies my problem: since all the data is measured only by the system (the algorithm), it can only assume what the result, or the feeling of the user, is. A click-based ranking like this “assumes” that the first opened document is the most relevant. That is utter nonsense. When a document is opened first in a search session, that only proves it was the most attention-grabbing one. One cannot know whether a document is relevant prior to opening it and at least briefly reading through it; the most relevant one may very well be the last document opened. The worst part is that once a non-relevant document gets a strong tag of “digital pheromone”, it confuses every future searcher, and there is no way it ever gets left behind: future searchers will be fooled into thinking it is the most relevant document and click on it, further increasing the strength of its “digital pheromone”.
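For reference, the exponential-decay deposit the quote describes can be sketched as follows. The decay constant is my own assumption, not a value from the paper:

```python
# Rough sketch of the SessionRank idea as quoted: the k-th click in a
# session deposits exponentially less pheromone than the previous one,
# since later clicks are treated as less relevant. DECAY is a made-up
# parameter for illustration.

import math

DECAY = 0.5  # hypothetical decay rate per click position

def session_deposits(clicked_urls):
    """Map each clicked URL to the pheromone increment for its position."""
    return {url: math.exp(-DECAY * position)
            for position, url in enumerate(clicked_urls)}

deposits = session_deposits(["first.example", "second.example", "third.example"])
# The first click receives the largest increment; later clicks decay.
assert deposits["first.example"] > deposits["second.example"] > deposits["third.example"]
```

Seeing it spelled out makes the criticism above concrete: the whole weighting rests on the assumption that click order equals relevance order.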

That was all the non-technical information relevant to the problem of information overload. What to take from it? The swarm-based scent trails system is awesome. My problem is that it is measured only by the algorithm. From my humble point of view, it would be much more effective if the users themselves could spend their time to leave the “digital pheromone” consciously. Just imagine it were part of a decentralized system! People could spend their time increasing the searching efficiency of the whole swarm (the decentralized community), and they would be incentivized to do so because they would be mining a cryptocurrency. Such a community would be much more effective in spreading the “digital pheromone” than a community that might be bigger but cannot customize its dissemination. It would not only grow stronger with every additional active member, it could also repair wrongly disseminated digital pheromone, unlike the algorithmically run system!
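To make the proposal a bit more tangible, here is a purely speculative sketch of such a conscious, incentivized pheromone: users explicitly mark a result as relevant or misleading and earn a (hypothetical) token reward for doing so. Everything here, from the reward size to the voting rule, is my own invention, not anything from the paper.

```python
# Speculative sketch of "conscious pheromone": explicit, rewarded
# relevance marks. Negative marks let the swarm repair a wrongly
# reinforced trail, which a purely implicit algorithm cannot do.

from collections import defaultdict

pheromone = defaultdict(float)   # url -> pheromone level
balances = defaultdict(float)    # user -> token balance
REWARD = 0.1                     # made-up mining reward per contribution

def mark(user, url, relevant):
    """A conscious pheromone deposit: +1 if relevant, -1 if misleading."""
    pheromone[url] += 1.0 if relevant else -1.0
    balances[user] += REWARD     # incentive for spending the time

# A click-bait page got a strong implicit trail; conscious users correct it.
pheromone["clickbait.example"] = 3.0
for user in ("alice", "bob", "carol", "dave"):
    mark(user, "clickbait.example", relevant=False)

assert pheromone["clickbait.example"] < 0   # the wrong trail is repaired
assert balances["alice"] == REWARD          # each contributor was rewarded
```

Whether such explicit marking could ever match the volume of implicit click data is, of course, an open question.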


Comments

Ok dude, this is somewhat heavy stuff, but that’s expected given the fact this is a thesis.

Google actually has some kind of the swarm-based scent trail system, but it is highly ineffective

It might be difficult to catalog it as ineffective because, at least in my case, I don’t have anything to compare it to and really know whether it is effective or not.

From my humble point of view it would be much more effective if the users themselves could spend their time to leave the “digital pheromone” consciously

Yes, but it would also be much slower than just using algorithms. The rewards for taking the time to leave the pheromone would need to be good enough for people to justify spending those few seconds on every site they visit; a few seconds might seem like little, but it can add up really quickly.

Maybe artificial intelligence is the solution?

