Reinforcement learning in a nutshell

in #science, 6 years ago

Hello Steemians, I cannot believe that it is Thursday already! I hope that you guys are all doing well!
Sorry I was not able to produce any content yesterday. I got so busy debugging my code that by the time I decided to take a break from it, it was already very late.

Today, we are going to talk about a famous branch of artificial intelligence (AI) known as Reinforcement learning. This will be a very brief introduction to what is today a really huge field in the AI community. We will expand on it at least once every 2 weeks. So, let’s go ahead and start :) !

1- The three dominant branches of AI these days.

In the AI world (more precisely, in machine learning), scientists and developers are working on three major branches: Supervised learning, Unsupervised learning and Reinforcement learning.

Supervised learning basically means that you give the AI algorithm a bunch of data together with their corresponding labels (e.g. a picture of a cat and a label saying that it is a cat). You expect the algorithm to fit a function to those data that generalizes well enough to label data it has never seen.
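To make this concrete, here is a minimal sketch of supervised learning in plain Python. Everything in it is an assumption made for illustration: the two-number "features" (weight and ear length), the data points, and the choice of a nearest-centroid classifier, which is just one very simple way to fit labeled data and not how real image classifiers work.

```python
# Toy supervised learning: a nearest-centroid classifier.
# The features (weight in kg, ear length in cm) and all numbers are invented.

def fit(samples, labels):
    """Average the feature vectors of each label to get one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Label an unseen point with the class whose centroid is closest."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Labeled training data: each sample comes with the answer we want.
train_x = [[4.0, 6.5], [3.5, 6.0], [4.5, 7.0], [20.0, 11.0], [25.0, 12.0], [30.0, 10.5]]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = fit(train_x, train_y)
print(predict(model, [5.0, 6.8]))    # an unseen, cat-sized animal
print(predict(model, [22.0, 11.5]))  # an unseen, dog-sized animal
```

The important part is the split between fit (which sees the labels) and predict (which has to guess the label of data it was never shown).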

cat-dog-flow-horizontal.gif

Unsupervised learning is different in the sense that you give unlabeled data to your AI algorithm (e.g. pictures of cats with no labels). Through training, you expect the algorithm to find patterns in those data on its own, so that it can group them, generate similar data, denoise them, etc.
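As a small illustration of "finding a pattern" without labels, here is a hedged sketch of k-means clustering on a handful of numbers. The data, the choice of two clusters and the simple way the centers are initialized are all assumptions made for this example.

```python
# Toy unsupervised learning: k-means on unlabeled 1-D points.
# The data and the choice of k=2 are invented for this illustration.

def kmeans_1d(points, k=2, iterations=10):
    """Group points into k clusters (k must be at least 2 in this sketch)."""
    lo, hi = min(points), max(points)
    # Spread the initial centers evenly between the smallest and largest point.
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers, clusters

data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]  # no labels anywhere
centers, clusters = kmeans_1d(data)
print(centers)
print(clusters)
```

Nobody told the algorithm that there are "small" and "large" values; it discovers the two groups from the data alone.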

graph_machine_learning.png

Reinforcement learning is quite different from both, because you let the AI algorithm learn by itself: it interacts with an environment (often a virtual one) and learns from the constant feedback it gets. Let’s dig more into this concept, shall we?

2- Reinforcement learning in a nutshell.

When a toddler is trying to figure out how to walk on his own, he interacts with an environment (the house, the playground, etc.). For every action that he takes, he gets feedback. What I mean by feedback is that he can get either positive rewards (his parents praise him, he realizes that he is not falling, he receives some cookies, etc.) or negative rewards (he falls, he gets hurt, etc.). He perceives those rewards through his senses, and eventually, through trial and error, he learns which actions he has to take to get positive rewards and walk properly.

450px-Learning_to_walk_by_pushing_wheeled_toy.jpg

Now that I have explained that concept, I can tell you that you’ve just grasped the very core concept of Reinforcement learning! It works much the same way. The AI algorithm interacts with a virtual environment via an agent, and it has a list of actions that it can possibly take (the action space). The rule the agent uses to choose an action in a given situation is called a Policy. There are many advantages to using a virtual environment, one of them being that the agent can try and fail as many times as needed without any physical repercussions, just like in a game! A virtual environment can be many things, such as a simulation, a self-driving car simulator, a video game, etc.
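The interaction described above can be sketched as a simple loop: the agent observes a state, picks an action, and the environment answers with a new state and a reward. The Corridor class, its reward values and the run_episode helper below are hypothetical names invented for this illustration, not part of any library.

```python
import random

class Corridor:
    """Hypothetical toy environment: cells 0..4, start at cell 0, goal at cell 4."""
    ACTIONS = ("left", "right")

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back at the start and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action and return (new_state, reward, done)."""
        move = 1 if action == "right" else -1
        # The agent cannot leave the corridor.
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 1.0 if done else -0.1  # feedback values chosen arbitrarily
        return self.position, reward, done

def run_episode(env, choose_action, max_steps=100):
    """One episode: observe the state, act, receive a reward, repeat."""
    state = env.reset()
    total_reward = 0.0
    steps = 0
    for steps in range(1, max_steps + 1):
        action = choose_action(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward, steps

rng = random.Random(0)  # fixed seed so the run is repeatable

def random_agent(state):
    """An agent that has learned nothing yet: it ignores the state and acts at random."""
    return rng.choice(Corridor.ACTIONS)

total, steps = run_episode(Corridor(), random_agent)
print(f"random agent: {steps} steps, total reward {total:.1f}")
```

A random agent fails a lot before it stumbles onto the goal, and that costs nothing in a virtual environment. Learning means replacing random_agent with something that uses the rewards to choose better.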

Rl_agent.png

One of the most popular software libraries used these days to create environments is called “gym”. It was produced by OpenAI, the non-profit company that was partially funded by Elon Musk. They have a bunch of game environments, such as Atari, Aladdin, etc.

musk.jpg

The agent is the part of the AI algorithm that observes the environment and experiences the consequences of its actions. It is also the means by which the algorithm takes an action in the environment. For instance, when you play a 2D video game in a maze, you can take any of the following actions: go up, go down, go right or go left. Each of those actions results in some reward, either positive or negative. The rewards are defined by the environment (through what is called a reward function), while the Policy is the agent's own strategy: it tells the agent which action to take in each situation, and it is what gets improved during learning. You can see the reward function as a sort of written contract that defines what kind of reward the agent gets for its actions, and the Policy as the agent's plan for getting as much reward as possible under that contract. Makes sense, right? :)
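To show the maze example in action, here is a minimal sketch of an agent learning by trial and error with tabular Q-learning, one classic reinforcement learning algorithm (I am picking it for illustration; the post itself does not name an algorithm). The 3x3 maze layout, the reward values and the hyperparameters (alpha, gamma, epsilon) are assumptions made for this example. Note how the rewards live in the environment's step function, while the policy is the agent's way of picking an action from what it has learned.

```python
import random

# Hypothetical 3x3 maze: start top-left, goal bottom-right, one wall in the middle.
SIZE, START, GOAL, WALLS = 3, (0, 0), (2, 2), {(1, 1)}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ACTIONS = list(MOVES)

def step(state, action):
    """Environment: returns (next_state, reward, done). Reward values are arbitrary."""
    row, col = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    if not (0 <= row < SIZE and 0 <= col < SIZE) or (row, col) in WALLS:
        row, col = state                  # bumping into a wall: stay put
    if (row, col) == GOAL:
        return (row, col), 1.0, True      # positive reward for reaching the goal
    return (row, col), -0.04, False       # small penalty for every other move

def greedy(q, state):
    """Greedy policy: pick the action with the highest learned value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Q-table: one value estimate per (state, action) pair, all starting at zero.
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(50):               # cap the episode length
            # Epsilon-greedy: mostly exploit, sometimes explore at random.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the learned greedy policy from the start and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, greedy(q, state))
        path.append(state)
        if done:
            break
    return path

q_table = train()
print(greedy_path(q_table))
```

After training, following the greedy policy should lead from the start to the goal around the wall. Changing the reward values in step changes what the agent ends up learning, which is why the reward function matters so much.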

Reinforcement_learning_diagram.svg.png

Ok, that was it for today regarding Reinforcement learning. Remember that I only talked about the very basics so that it did not become too technical. We will of course talk about it in more depth at least once every two weeks. Please let me know if you think that I was either too general or, on the other hand, too technical, as that would really help me produce better AI content for all of you guys :)! If you are interested in the subject, I highly recommend reading the book Reinforcement Learning: An Introduction by Dr. Richard S. Sutton and Andrew G. Barto. Tomorrow, we will talk about the video game Assassin's Creed I (Xbox 360).

Hope that you guys have a great end of the day/night depending on where in the world you are! :)

Pictures

Picture 1 was taken from https://goo.gl/vy3YxP

Picture 2 was taken from https://goo.gl/aDxpcW

Picture 3 was taken from https://goo.gl/LZpwwN

Picture 4 was taken from https://goo.gl/BnGj3Y

Picture 5 was taken from https://goo.gl/vCgMMa

Picture 6 was taken from https://goo.gl/rV5Xo4

juv79505 sincerely thanks you for reading this article. Please feel free to comment below in case you have ideas, questions, suggestions or simply want to criticize this article. Also, note that all pictures used in this article were taken from the Google section “pictures labeled for commercial reuse”. Stay tuned for more articles on health, environment, artificial intelligence, video games, technology in general, books, graphic novels, geography, history, sports and much more!



As a computer engineering student I find this exciting! The technology still has a ways to go but it is improving all the time!

Thanks for reading and commenting on my post :)! Looks like we will have a lot to talk about, I just followed you! Yeah, I study AI and I find all these algorithms fascinating! I think I am more interested in Reinforcement learning because of the gaming aspect of it.

By the way, OpenAI's bots (with their newly trained RL algorithm) will be playing against some very good Dota 2 players on August 5th on Twitch! This is in preparation for The International, the big Dota 2 tournament. You should watch it if you are interested, this is huge! Link below

https://twitter.com/OpenAI/status/1019619485098000384

