Build and Train General AI in OpenAI Gym: Initial Setup

in ai •  2 years ago

The OpenAI Universe project seeks to decentralize artificial intelligence by allowing anybody to build bots that, ideally, can learn to play thousands of video games well. Games ranging from classic Atari titles and Minecraft to Grand Theft Auto V are available in this environment, but actually building a bot (or agent) to act in them is left to us. This is where OpenAI Gym comes into play.

Instead of having to make a bot for each individual game (i.e. task-specific AI), the goal here is to make one bot that plays many games well (i.e. general AI) without knowing anything about them at the start. All we are going to have available is the ability to view the screen (visual recognition) and knowledge of which controls exist, such as an Atari paddle (moving left/right) or a joystick (left/right/up/down and a button).

When I first read up on this, it was stated that you only needed 9 lines of Python code (which is true) to get started! That sounded too awesome to not give it a try!

However, getting the environment set up so that these 9 lines of code run successfully was another matter. Having mostly been a Windows user, this ended up taking me a little longer than I expected, since the project only supports Linux and OS X (Apple's operating system).

But as of yesterday I have a running version. While the image below may not look terribly exciting, it took me some time to get an Atari game to run and render. From here I can finally start the fun part of working on the actual intelligence of the bots to play these games.

Reinforcement Learning

OpenAI Universe is what sets up the environments that our bots are going to play within. This can be any of the thousands of games they already have available, and several can even run at the same time.
OpenAI Gym is the toolkit we use to build our game bots (or agents) and connect them to the game universe. From there our agents are able to read the game screen and try to maximize their "score." This is how we are able to close the loop for reinforcement learning.

Using the example of a Super Mario game, a simple game bot (agent) could just press right, getting a higher score the farther to the right it goes. As we all know, there are lots of ways to be killed (ending the game), such as getting hit by a monster, falling down a hole, etc. So just pressing right will only get us so far. Over time we need to find a way to adapt our instruction of only pressing right to include things like jumping, shooting fireballs, etc. This is where we will end up making use of reinforcement learning.
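To make the "only press right" problem concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and made up for illustration: the ToyMarioEnv class, the track length, the position of the hole and the reward numbers have nothing to do with the real game or with the Gym/Universe API (its step() returns a simplified 3-tuple). It uses tabular Q-learning, one common reinforcement learning method, to show how a bot can discover that it has to jump.

```python
import random

RIGHT, JUMP = 0, 1   # the only two "buttons" in this toy game
TRACK_LENGTH = 8     # hypothetical level: positions 0..7
PIT = 4              # hypothetical hole in the ground


class ToyMarioEnv:
    """Made-up stand-in for a side-scrolling level (not a real Gym env)."""

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        advance = 2 if action == JUMP else 1
        self.pos = min(self.pos + advance, TRACK_LENGTH - 1)
        if self.pos == PIT:
            return self.pos, -5.0, True     # fell down the hole
        if self.pos == TRACK_LENGTH - 1:
            return self.pos, 10.0, True     # reached the end of the level
        return self.pos, float(advance), False  # score grows as we move right


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: q[state][action] estimates the future score."""
    rng = random.Random(seed)
    env = ToyMarioEnv()
    q = [[0.0, 0.0] for _ in range(TRACK_LENGTH)]
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:      # occasionally try something new
                action = rng.choice((RIGHT, JUMP))
            else:                           # otherwise use the best known action
                action = max((RIGHT, JUMP), key=lambda a: q[state][a])
            next_state, reward, done = env.step(action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def run_policy(policy):
    """Play one game with a fixed policy and return the total score."""
    env = ToyMarioEnv()
    state, done, total = env.reset(), False, 0.0
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
    return total


q_table = train()
always_right_score = run_policy(lambda state: RIGHT)
learned_score = run_policy(
    lambda state: max((RIGHT, JUMP), key=lambda a: q_table[state][a]))
print("always right:", always_right_score, "| learned:", learned_score)
```

The always-right bot walks straight into the hole, while the trained bot learns to jump over it. Real games are far too large for a lookup table like this, which is why the later steps involve TensorFlow.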

The game world is loaded up by OpenAI Universe (the environment), the game bot is built with OpenAI Gym (the agent), and over time we will refine our actions to get the highest score (reward) possible by finishing the level.
Follow-up posts will cover the actual bot training; this one solely discusses the setup.

How to Set Up OpenAI Universe and Gym

[NOTE: There are many ways you can get OpenAI running. This was just MY successful method.]

For those who run Windows, the way to get a Linux environment is to download VirtualBox and install it. Next you'll need to download the Linux system (in my case Ubuntu 16.04) that we'll be using from here. This file (.iso) is about 1.5 GB, so it may take a little while depending on your internet connection. Make sure to save it somewhere you can easily find it, such as your desktop.

Now we'll create your first virtual machine running Ubuntu by following these steps. Make sure to do the last step of removing the installation file (the .iso you downloaded) from the "optical drive." Otherwise when you restart your virtual machine it will try to start the installation process all over again. [If you follow the instructions, say YES to force unmount when prompted.]

From here I followed this guide, which uses Anaconda; it includes many of the libraries we are going to need, such as SciPy and NumPy. Make sure to follow the "For Ubuntu 16.04" instructions, not the 14.04 ones. Everything in this guide worked well for me, except for the Tensorflow 11.0 section. [Pip didn't work for me with TensorFlow.]

If you get messages that the TensorFlow wheel isn't compatible/available, just go here and follow the Anaconda installation, which has you run the commands:
$ conda create -n tensorflow python=3.5

$ source activate tensorflow
(tensorflow)$ # Your prompt should change

# Linux/Mac OS X, Python 2.7/3.4/3.5, CPU only:
(tensorflow)$ conda install -c conda-forge tensorflow
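A quick way to confirm the install worked is to ask Python whether it can find the libraries from inside the activated environment. This is just a small sketch of my own, not part of either guide; check_packages is a hypothetical helper name, and it only checks that each package can be located, not that it runs correctly.

```python
import importlib.util


def check_packages(names):
    """Return {package_name: True/False} depending on whether Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Run this inside the activated "tensorflow" conda environment.
    for name, found in check_packages(["numpy", "scipy", "tensorflow"]).items():
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

If a package shows up as MISSING, the usual culprit is that the script was run outside the conda environment.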

After you have TensorFlow installed, you can go back to this guide and pick back up at the heading "Next we can get started by installing Docker:".

This will walk you through the setup of Docker, OpenAI Universe and OpenAI Gym.
At the end it will have you install a starter agent for playing Pong. To get the MsPacman game running that I had in my screenshot, just save this code into a Python file (I used

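import gym
env = gym.make('MsPacman-v0')
env.reset()  # the environment must be reset before the first step
for _ in range(1000):
    env.render()  # draw the current frame so the game is visible
    env.step(env.action_space.sample())  # take a random action
env.close()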

This code came from here.
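The random-action loop above keeps stepping for 1000 frames without looking at what step() returns. As a hedged sketch of the next small step, the helper below plays one episode, adds up the reward and stops when the game ends. It assumes the classic Gym API of that era, where step() returns (observation, reward, done, info); newer Gym releases changed this. run_random_episode and FakeEnv are hypothetical names of my own, and FakeEnv exists only so the sketch runs without Gym installed.

```python
import random


def run_random_episode(env, max_steps=1000, render=False):
    """Play one episode with random actions; return (total_reward, steps)."""
    env.reset()
    total_reward, steps = 0.0, 0
    for steps in range(1, max_steps + 1):
        if render:
            env.render()
        # Assumed classic Gym API: step() -> (observation, reward, done, info)
        _, reward, done, _ = env.step(env.action_space.sample())
        total_reward += reward
        if done:  # the game is over, so stop stepping
            break
    return total_reward, steps


class FakeActionSpace:
    """Hypothetical stand-in for env.action_space."""

    def sample(self):
        return random.randrange(4)


class FakeEnv:
    """Hypothetical stand-in environment: 1 point per step, ends after 5 steps."""

    action_space = FakeActionSpace()

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 5, {}

    def render(self):
        pass


print(run_random_episode(FakeEnv()))

# With Gym and the Atari environments installed, the same helper should work on
# the real game:
#   import gym
#   env = gym.make('MsPacman-v0')
#   print(run_random_episode(env, render=True))
#   env.close()
```

Tracking the total reward per episode gives a baseline score for a purely random bot, which is handy to compare against once the actual training starts.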

Additional Resources:
OpenAI Universe Documentation
OpenAI Gym Documentation
OpenAI – Universe Installation Guide Ubuntu 16.04
Tensorflow Guide


Image Sources:
MultiGame Panel
[Screenshot is from me]
Reinforcement Learning


nice post


Whoever or WHATEVER can predict the future, controls the world.
Steemit Blockchain Will Be The Central Cortex Of The Global Quantum A.I.'s Brain - The Sentient World Simulation, for predicting the future.

All the quantum computers of all the world's biggest corporations are interlinked and working to this end.


Anyone with even half a brain knows that blockchain technology is not going away. It's here to stay!
Why? Because EVERYTHING... every piece of data and every piece of edge computing hardware is soon to be tied (chained) into the blockchain matrix in the cloud. Including EVERY product, service, creative and recreational activity that we do.


Using 5G technology, cheap telemetric printed circuit tattoos (like barcodes on steroids) able to sense, process and transmit data will be in EVERYTHING and EVERYONE.
We will all become part of the Sentient World Simulation (SWS) as described in the popular Purdue University white paper on this subject. It's already up & running right under your noses as you read this blog post. IT'S WATCHING YOU!!!

We each have a Global Quantum A.I. avatar following and recording our every move using our 'smart' phones, 'smart' appliances - TV's, game consoles, 'smart' fridges, 'smart' meters & 'smart' cookers, etc.
We each have our own virtual world avatar that simulates & mimics our character, in order to predict the future.
Whoever or WHATEVER can predict the future, controls the world... The Global Quantum A.I.


That's cool!


Thank you! I'll show how the actual bot building and training goes in future posts.

Great post @sykochica, we need more programming explanations like this!


Thank you! I plan to get some more programming posts out, even if not all with openAI specifically, but at least something AI or game related.