All singing, all dancing ~ BOTS! ~ Act 1

in #bots, 8 years ago (edited)

Bots bots bots, Steemit is all just bots, amirite? 😜 Some people really feel that they are a parasite on the system, some that they are beneficial, some again are dubious but are happy to benefit from them, and so on.

As someone who has created one and intends to develop it further in a way which will hopefully be "good" for Steemit, I want to dive into the topic and get my thoughts in order.

To wit, this is the first of a short series of articles discussing bots on Steemit, starting broadly and narrowing in on whether or not they are a good thing, a necessary thing, might destroy Steemit eventually, and other such pertinent questions.

Act 1: Fun with definitions

A recent study of bot activity on the internet as a whole by security company Incapsula reported that, by a small margin, most traffic is generated by bots. @stellabelle posted her thoughts on the matter here, and when I read her post it occurred to me: well, how are they defining bot traffic? Or bots in general?

One example given is Facebook's feed refresh fetcher. I thought it was extremely generous to call an integral function of a complete app a bot, so I set about trying to figure out what a good definition is.

Of course this is a bit pedantic, but a claim like "the internet is mostly bots" is a pretty strong statement with consequences for how people view the internet, so I think it's worth checking out. Defining the term well may also contribute to the current discussion of bots on Steemit, because it necessarily leads us to consider their purpose.

Exclusive interview with Incapsula's head of marketing!

I decided to chance it and get in touch with Incapsula about the report. I was very fortunate to have an email exchange with Igal Zeifman, head of marketing - it was kind of an interview 😅. It is reproduced here minus the pleasantries (the full text might be available if you use the new Steemit diff tool 😉):

personz

Hi, I am reaching out to clarify why your organisation considers feed fetchers to be bots? Surely a distinction must be made between code which runs periodically or as a delayed task, and bots which have more autonomous goals, such as crawlers.

Basically I wonder if the following definition from your article "What is an Internet Bot?" is too broad: "A bot , also known as Internet bot, is a program that runs automated tasks over the Internet."

Igal Zeifman

That's a good question.

Feed fetcher are l [one] non-human visitors and, in many respects, they are not significantly different from any other bot that runs tasks on demand. (e.g., some marketing tools that will scan on demand or health checkers that can be scheduled or used for one-time checkups)

The fact the scope of their automation is limited doesn't meant that they are 'manual'. (eg., think about how much content Twitter fetches automatically when you log in)

I absolutely agree with your that a distinction should be made between various bot categories. We tried to showcase that by providing examples of individual bots in each of sub-segments.

As an interesting side-note, the original draft of our infographic actually categories bots as "semi automated" vs "automated". Eventually, however, we felt that this over-complicated things and took the focus away from the core discussion.

personz

Thank you for clarifying that point that you are talking about non-human visitors. So the distinction is that if a human can manually visit an internet resource in a meaningful way, and then some software component also does that, we can call this bot behaviour. However with feed fetchers does this really apply? For example, one does not get meaningful data from a feed request from the Facebook APIs, it is intrinsically tied into the program.

By this further clarified definition it is clear that scrappers, crawlers and the like are clearly bots as a shallow review of URI requests would show no difference between fully human requests, only behaviour markers which your company seem to excel in finding (this is very interesting by the way).

I would say that limitation on scope of automation is precisely what distinguishes bots from non-bot automation. Otherwise we are left with the case in which packet routers and all sorts of basic low level functionality could be considered bots. For example, would you call the software which sends an email at a delayed time a bot? In this case it is one time delayed automation. The feed fetcher seems to me to be periodic delayed automation.

And thank you for the behind-the-scenes note about semi-automated vs. automated, perhaps this distinction really clarifies enough. But I'd still be interested to hear your response to the above.

Igal Zeifman

You are right :) Delayed and triggered automation is automation still.

In re to your other question, behavior analysis is important but mostly when dealing with malicious bots.

Most good bots will declare their identity via user-agent headers. Also most operators of these bots will have some official documentation, which commonly includes IP of origin.

Still on the topic of behavior, bots interact with websites in a way which is often similar to humans. (e.g., parsing HTML and even CSS and JS, requesting images and etc)

In the end, most bots - from feed fetchers to SE crawlers - are simply there to gather data for human consumption. I'm mot sure, however, if I'm in a position to decide which of that activity is meaningful or not (is it meaningful to collect results for a SERP pages that never gets viewed? is collecting that info not intrinsically tied into Googlebot programming?)
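As an aside on Zeifman's point that most good bots declare their identity via user-agent headers, here is a minimal Python sketch of what checking for that declaration could look like. The function name and the short token list are my own illustrative assumptions, not anything Incapsula described; a real classifier would rely on much longer lists plus the published IP ranges he mentions.

```python
# Illustrative tokens that some well-known crawlers and fetchers put in their
# User-Agent header. This short list is an assumption made for the example.
DECLARED_BOT_TOKENS = ("googlebot", "bingbot", "facebookexternalhit", "twitterbot")


def declares_itself_a_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known bot token."""
    ua = user_agent.lower()
    return any(token in ua for token in DECLARED_BOT_TOKENS)
```

Note that this only catches bots that are honest about themselves. A malicious bot can send a browser-like user agent, which is why behaviour analysis matters for those.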

Thoughts on this exchange

I'm partially convinced.

Zeifman says that feed fetchers "[...] are not significantly different from any other bot that runs tasks on demand. (e.g., some marketing tools that will scan on demand [...]"

In fact, this is backed up by their definition in a previous blog post:

A bot , also known as Internet bot, is a program that runs automated tasks over the Internet. Typically intended to perform simple and repetitive tasks, Internet bots are scripts and programs that enables their user to do things quickly and on a scale. For example, search engines like Google, Bing, Yandex or Baidu use crawler bots to periodically collect information from hundreds of millions of domains and index it into their result pages.

They qualify bots as internet bots. This is important because it distinguishes a "program that runs automated tasks over the Internet", from another class of programs which just run automated tasks.

By this definition, rendering a batch of videos, for example, would be considered the work of a bot if they were rendered "over the internet". Premiere Pro or Final Cut do not primarily use internet communication, and so this disqualifies them, even though they might, for example, send analytics data over an internet connection if available. But there are other services which do transcoding, such as Convert.co and many others, which I presume would be considered bots by Zeifman.

Back to my first objection, the Facebook feed fetcher does in fact satisfy this. But then does a packet router? (note: I mean a packet router which sends the basic tiny messages of the Internet, see here if you are unfamiliar with the term) Would we consider them to be bots? The answer is in this comment by Zeifman:

[...] on the topic of behavior, bots interact with websites in a way which is often similar to humans. (e.g., parsing HTML and even CSS and JS, requesting images and etc) In the end, most bots [...] are simply there to gather data for human consumption.

This is how we exclude packet routers: their work is too low level. This helps us a little in our search for a definition, but it's still very broad. That's why he can confidently imply that bots perform the task of getting the "[...] content Twitter fetches automatically when you log in".

Robot - namesake of the bot

Let's get back to basics, and think about how (Internet) bots are related to robots, their forebears.

If you look for an example dictionary definition of a robot you will see that the common, basic feature is automation. But that's not really enough common ground, as we will see.

I think that robotics is afflicted with the same discrepancy as bots: when is a machine a robot? It's a similar question to when is a program a bot? Let's take a brief look at that to see if it sheds any light on the bot definition problem.

There is an interesting article asking "What is a robot?". It deals with the overlap of machine and robot, and makes some good points. I'll reproduce several paragraphs here:

After [Karel] Capek [a Czech sci-fi writer] brought "robot" into the lexicon, it quickly became a metaphor for explaining how various technologies worked. By the late 1920s, just about any machine that replaced a human job with automation or remote control was referred to as a robot. Automatic cigarette dispensers were called "robot salesmen," a sensor that could signal when a traffic light should change was a "robot traffic director," [...]

Today, people talk about robots in similarly broad fashion. Just as "robot" was used as a metaphor to describe a vast array of automation in the material world, it’s now often used to describe—wrongly, many roboticists told me—various automated tasks in computing [emphasis mine]. The web is crawling with robots programmed to perform tasks online, including chatbots, scraper bots, shopbots, and twitter bots. But those are bots, not robots. And there’s a difference.

"I don’t think there’s a formal definition that everyone agrees on," said Kate Darling, who studies robot ethics at MIT Media Lab. "For me, I really view robots as embodied. For me, algorithms are bots and not robots."

"What’s interesting about the spectrum of bots, is many of the bots have no rendering [physical body] at all," said Rob High, the chief technology officer of Watson at IBM. "They simply sit behind some other interface. Maybe my interface is the tweet interface and the presence of the bot is entirely math—it’s back there in the ether somewhere, but it doesn’t have any embodiment."

For a robot to be a robot, many roboticists agree, it has to have a body.

Robots in body, bots in mind

When we perceive a machine that in some way resembles us, be it in decision making capacity, physically, and especially by talking, we say it's a (ro)bot, and with it conjure up the corpus of popular imagery we've learned and love.

Bot or not?

Bouncing on from something else in the above article, there is a great podcast called Robot or Not? In micro, byte-sized 😉 broadcasts, "John Siracusa educates Jason Snell about what is, and what is not, a robot." It's a kind of wandering walk through any and every pop culture robot you can imagine, and awesome at that 😎

The author of the above article referred to this show in their search for the definition of a robot. I did my own listening to all of the nearly 100 episodes (I highly recommend it) and found that they specifically address chatbots. Here's an excerpt, edited where possible to shorten it:

Jason Snell: People suggest dumb things sometimes for this podcast and this is one of them. [...] People wanted to ask about like AIM bots and IRC bots [...] These are not really robots, they're software [...]

John Siracusa: They're not robots unless we're all in the Matrix and don't know we're in the Matrix [...] I think the more interesting question is why do people call these things bots at all. Why is it called an IRC bot. Why, you know, AOL's messenger bot or whatever, Slackbot -

Jason Snell: - people love robots! The kids, they love their robots!

John Siracusa: - it's because when you write a program that exhibits any sort of like mildly, not just autonomous behavior, but like "it thinks it's people" kind of behaviour, where you will say something and Slackbot will reply with a snarky reply, [...] And it's a software program that does it and so they give the name "bot" because it is different from, ah I dunno, a program that you just run that downloads a file or that like will spider a site and look for all the different files and download them all, right, that's not called a bot.

John Siracusa: A bot is a thing that interacts with us a tiny little bit like a human would, like it's using the same interface as a human [...] and so if we make a program that does that we call it bots like "Oh! It's like a person but not really, it's a dummy silly mechanical type person, but it's fun". And so that gets to the heart of what real robots are, but none of those things of course are actually robots, they're just taking the name because, in the same way that we graduate from machine, like a toaster oven, to robot once it starts having a certain level of - again, I don't want to say autonomy - but a certain level of interacting with the world, even a little bit like living things or humans do, then we start to think it's a robot, and the same thing with programs.

Jason Snell: But they're not robots.

John Siracusa: Nope.

Interfacing like a human

Very interestingly, Siracusa does not include website spiders (famously used by Google to "crawl" the Internet to build their search database) even in his definition of a bot. This is because they do not exhibit "it thinks it's people" kind of behaviour. He extends this slightly, beyond how the bot thinks to what it potentially does:

A bot is a thing that interacts with us a tiny little bit like a human would, like it's using the same interface as a human [...]

This is interesting in the context of Steemit, because it's precisely what the so-called bots here do. In fact, it would be impossible to distinguish bot upvotes or comments from human ones, except for markers in the content or other contextual giveaways, like voting faster, in bulk or more spread out than a human ever would. (Actually, writing this part led me to suggest a small single-purpose web app idea to look for these markers, just for fun! If you'd like to see it get made, comment! 🤓).

It's also interesting because it does actually allow website spiders or crawlers to be given the title "bot", as they read a website using the exact same tools available to people, i.e. the interface, in contradiction with what he said.
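To make the "contextual giveaways" mentioned above a bit more concrete, here is a minimal Python sketch of one such marker: an account voting faster than a person plausibly could. Everything in it is an assumption for illustration (the function name, the thresholds, and the idea of feeding it a plain list of timestamps in seconds); it does not use any real Steem API and it only covers the "voting faster, in bulk" case.

```python
from statistics import median


def looks_automated(vote_times, min_votes=10, min_human_gap=2.0):
    """Guess whether a series of vote timestamps (in seconds) is bot-like.

    Hypothetical heuristic: if the typical gap between consecutive votes is
    shorter than `min_human_gap` seconds, a human probably did not cast them.
    """
    times = sorted(vote_times)
    if len(times) < min_votes:
        return False  # too little evidence either way
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return median(gaps) < min_human_gap
```

Using the median rather than the mean keeps one long pause (a bot waiting for the next batch of posts) from hiding a run of rapid-fire votes.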

Siracusa touches on the hierarchy of automatons leading to robots, in what he refers to as "graduations". Let's look into that further and see if it applies to bots as well as robots.

A hierarchical taxonomy

Another great discussion which cuts to the core of the definition, though a little theoretical, was this poster's answer on English StackExchange (for the language, not the people of England 😂):

As well as automata, robots ARE machines, because they are systems that have been invented by humans.

Also, robots ARE automata as well, because they are, basically, automatic machines.

[...]

Whether the robots be autonomous or not (remote controlled for instance, or supervised, for virtual robots), we do not say to robots: "switch from this state to this state". What we say to robots is much more something like "do that". And from here, they are self-deciding, whereas automata are just self-acting. We don't say to robots "how to do things". We say them "what to do". This is where "artificial intelligence" comes into the game. We tell them what to do, and they are sophisticated enough to "decide" by themselves how to do actually the things, they choose the best way of changing their internal states regarding the context.

Note that the term itself robot gives a good idea of that. It comes from the Czech robotnik which means slave. Because we dont pilot them step-by-step, we just say them what to do.

[...]

Finally, as well as machines and automata, a robot may be non-physical. In the case of robots, we do not say theoretical nor abstract, though, since they always are applied to concrete cases, so we say virtual. Chatterbots, or Web Crawler are virtual robots, also called bots.

(source)

(ro)Bots don't just do, they decide

This is an amazing line from that post and I just want to reiterate it:

And from here, they [robots] are self-deciding, whereas automata are just self-acting.

Self-deciding is the key according to this poster. I really like this as a test of robot-hood and I think it applies to bots too. Bots make decisions, even if small. It's still kind of vague, but I think we can agree that a feed fetcher is as dumb as you can get and doesn't make any decisions. Nor do movie transcoders, etc. Advanced voting bots, however, do make decisions: according to strict programming, but decisions nonetheless.
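The self-acting versus self-deciding distinction can be sketched in a few lines of Python. Both functions and the post fields (`id`, `author`, `words`) are hypothetical, made up purely to contrast the two kinds of program; real feed fetchers and voting bots are obviously more involved.

```python
def feed_fetcher(feed):
    """Self-acting: hands back every item it is given, with no choice involved."""
    return list(feed)


def voting_bot(posts, trusted_authors=frozenset(), min_words=300):
    """Self-deciding, in a small way: chooses which posts to upvote.

    The rule is strict programming (trusted author, or a long enough post),
    but the outcome depends on what the bot finds in each post.
    """
    return [
        post["id"]
        for post in posts
        if post["author"] in trusted_authors or post["words"] >= min_words
    ]
```

We tell the fetcher how to do its job, step by step. We tell the voting bot what we want (votes on posts worth voting for) and let its rules settle each individual case.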

Non-physical or non-robot?

Nonetheless, it's clear that there is some gray area here. The paragraphs above are extrapolating robot behavior onto bots. What the poster really considers bots to be are virtual robots. Siracusa would consider virtual robots to be robots to persons completely within the virtual system only, and not to outside observers. In the same Robot or Not? episode quoted above, Siracusa touches on this. Here's the full quote, which was shortened before:

John Siracusa: [Bots are] not robots unless we're all in the Matrix and don't know we're in the Matrix and the things that we're calling robots are like the robots we're talking about on the show, where they're like machines that wander around doing things. [...]

They actually end up doing a full episode about The Matrix where they consider it in more detail, ultimately coming back to the bottom line that the lack of physical embodiment (and central control for some reason) disqualifies the agents and other programs such as the Architect, etc, from robot-hood.

Jason Snell: [...] So Mr. Smith, Agent Smith and all the other agents in The Matrix, [...] are they robots? People who are living there in The Matrix, they're in a virtual world, these are... It's not entirely clear are they part of the operating system of The Matrix, are they independent entities in The Matrix? They're artificial lifeforms in that way but are they robots?

John Siracusa: No, it's like a video game, they're just characters in a video game. I mean you can have intelligence, they could be centrally controlled, they can have little autonomous agents, they're all referred to as programs but the bottom line is what they're made out of is nothing. What they're made out of is just a bunch of data that is interpreted by each of the minds that is in The Matrix, there's no, there's nothing there-there. What's actually happening is a bunch of people sitting in pods - spoilers - and a big computer somewhere, and electrical signals and synapses firing, right? [...]

But since The Matrix is kind of like the Internet (kind of, come on, help me out here 😌 🙏) the agents and other programs are really bots! Probably, maybe I've gone too far 😅

To try and make both the English StackExchange poster and John Siracusa's ideas coherent, we could say that virtual robots are not robots in the traditional sense, but that they are bots.

Bots are virtual robots

And thus we have arrived at a slightly vague but well-considered definition. Bots are virtual robots. Maybe that was obvious to you from the start 😅 If so, gold star for you! 💫 ⭐️

For me, one of the best things that was highlighted in this discussion was that for something to be considered a bot, it needs to do something that a person would, even if the bot does it much, much better or faster. This applies in the same way as it does to robots. A bot has to chiefly do something which replaces "normal" human work. I now realise that it is this point that always made it hard for me to consider Facebook's feed fetcher a bot: no person would really fetch feeds. However, this is really subjective; perhaps you would consider it normal human work, like a telephone operator at a switchboard in times of old.

Still, realising the subjectivity and interpretive nature of the label bot or robot has been the major outcome. I believe this is why voter bots on Steemit really fit the definition well. They literally replace a task you do in a way which is indistinguishable from the way you would do it as far as the interface is concerned, but of course cannot actually be how you really vote.

A last word on the report

Coming back to the Incapsula report: the claim that most Internet traffic is generated by bots no longer wows, concerns or overly interests me, as it did when I first read it. If you can consider the processes that Twitter initiates when you log in to be bots, then they are already integrated into the very clients and software themselves and should no longer have any kind of special meaning, unless they are particularly smart, as in the case of advanced AI chatbots or amazing Steemit voting bots 😉 The term "bot" as used there is just another uninteresting name for a fundamental software component.

This matters because what tickles our interest and imagination about bots and robots is their similarity to us. By the Incapsula definition, this no longer is the case.

Afterword: virtual robots in the age of gameified life

John Siracusa says of the "programs", AKA bots, in The Matrix that they're not robots, that "it's like a video game, they're just characters in a video game."

There is a strange interaction between social networks, games and these kinds of bots. Charlie Brooker, an English satirical film maker and creator of the popular TV show Black Mirror (watch it if you haven't already), once made a tongue-in-cheek argument for considering Twitter an MMORPG, or a massively multiplayer online role playing game. Basically like World of Warcraft and many others, but with words and hashtags instead of spells and 3D animation.

Charlie Brooker: Twitter is a massively multiplayer online game in which you choose an interesting avatar and then roleplay a persona who is loosely based on your own, attempting to recruit followers by repeatedly pressing lettered buttons to form interesting sentences.

This is interesting as our avatars on Twitter, and of course on Steemit, are kind of like characters we create, based loosely on ourselves. For some it's based very loosely, or with the details not much fleshed out (as in the case of myself 😳).

Robert Florence (Twitter user): What I do on Twitter a lot is just project a false persona. And it's like that avatar thing, it's like World of Warcraft or anything like that. The way I am on Twitter is nothing like the way I am in real life. That feels like a game sometimes, if you're a sociopath, it feels like a game.

In computer games, autonomous characters are often called NPCs (non-player characters) but the term bot would just as easily apply here, especially considering Siracusa's point above, comparing Matrix programs to video-game characters, implicitly NPCs.

Virtual robots AKA bots are basically virtual players, if the system is a game. Much of our online lives are gameified, with posts, awards and badges, so this comparison holds to a certain degree. Some people say bots "game" the system. With the above in mind, this statement takes on a new, literal sense.

Thanks for reading!

In the next part I intend to investigate "the bot problem", if there is one. This will touch on how bots work with Steemit and the Steem blockchain, possible scenarios which could "fix" the problem and why most of them are problematic themselves.

As always, thanks for reading! 😁 🙌 My focus has shifted away a bit from privacy and data protection, towards bots and the current buzz about them. I'm going with the flow for the moment. 😋 🐟

Super secret diff Easter Egg 😆

[...]


Steemit is not just bots.. its a few devils too.

You know it 👹
