Computations at a scale - Introduction to Tensorflow Part 3

kasperfred (58) in #tensorflow • 7 years ago (edited)

This is part three of a multi-part series. If you haven't already, you should read the previous parts first.

Part 1 where we discussed the design philosophy of Tensorflow.
Part 2 where we discussed how to do basic computations with Tensorflow.

This time, we will look at doing computations at scale, as well as how we can save our results.

This post originally appeared on kasperfred.com where I write more about machine learning.

Choosing devices

You can choose to compute some operations on a specific device using the template below:

with tf.device("/gpu:0"):
    # do stuff with the GPU
    pass

with tf.device("/cpu:0"):
    # do some other stuff with the CPU
    pass

Here, the strings "/gpu:0" and "/cpu:0" can be replaced with any of the available device name strings you found when verifying that Tensorflow was correctly installed.

If you installed the GPU version, Tensorflow will automatically try to run the graph on the GPU without you having to define it explicitly.

If a GPU is available, it will be prioritized over the CPU.

When using multiple devices, it's worth considering that switching between devices is rather slow because all the data has to be copied over to the memory of the new device.

Distributed computing

For when one computer simply isn't enough.

Tensorflow allows for distributed computing. I imagine that this will not be relevant for most of us, so feel free to skip this section. However, if you think you might use multiple computers to work on a problem, it may be of some value to you.

Tensorflow's distributed model can be broken down into two parts:

Server
Cluster

These are analogous to a server/client model. While the server contains the master copy, the cluster contains a set of jobs, each of which has a set of tasks. The tasks are the actual computations.

A server that manages a cluster with one job and two workers sharing the load between two tasks can be created like so:

import tensorflow as tf

cluster = tf.train.ClusterSpec({"my_job": ["worker1.ip:2222", "worker2.ip:2222"]})
server = tf.train.Server(cluster, job_name="my_job", task_index=1)

a = tf.Variable(5)

with tf.device("/job:my_job/task:0"):
    b = tf.multiply(a, 10)

with tf.device("/job:my_job/task:1"):
    c = tf.add(b, a)

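with tf.Session("grpc://localhost:2222") as sess:
    # variables must be initialized before they can be read
    sess.run(tf.global_variables_initializer())
    res = sess.run(c)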
    print(res)
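
To make the job/task structure concrete, here is a small plain-Python sketch (no Tensorflow required). It assumes the same placeholder addresses as above and only illustrates how the dictionary passed to tf.train.ClusterSpec is laid out: each job name maps to a list of addresses, and the position in that list is the task index. The helper names task_address and device_string are hypothetical and not part of Tensorflow.

```python
# Plain-Python sketch of the layout described by a cluster specification.
# The addresses are the placeholders used in this post.
cluster = {"my_job": ["worker1.ip:2222", "worker2.ip:2222"]}


def task_address(cluster, job_name, task_index):
    """Return the host:port that serves the given task (hypothetical helper)."""
    tasks = cluster[job_name]
    if not 0 <= task_index < len(tasks):
        raise IndexError("job {!r} has no task {}".format(job_name, task_index))
    return tasks[task_index]


def device_string(job_name, task_index):
    """Build the device string that tf.device expects for that task."""
    return "/job:{}/task:{}".format(job_name, task_index)


for index in range(len(cluster["my_job"])):
    print(device_string("my_job", index), "->", task_address(cluster, "my_job", index))
# /job:my_job/task:0 -> worker1.ip:2222
# /job:my_job/task:1 -> worker2.ip:2222
```

So an operation placed under "/job:my_job/task:1" is meant to run on whichever process was started with task_index=1, which the specification says lives at the second address in the list.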

A corresponding worker-client can be created like so:

# Get task number from command line
import sys
task_number = int(sys.argv[1])

import tensorflow as tf

cluster = tf.train.ClusterSpec({"my_job": ["worker1.ip:2222", "worker2.ip:2222"]})
server = tf.train.Server(cluster, job_name="my_job", task_index=task_number)

print("Worker #{}".format(task_number))

server.start()
server.join()

If the client code is saved to a file, you can start the workers by typing into a terminal:

python filename.py 0

python filename.py 1

This will start two workers that listen for task 0 and task 1 of the my_job job.
Once the server is started, it will send the tasks to the workers, which will return the answers to the server.
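
The worker script reads the task number with int(sys.argv[1]), which fails with a fairly unhelpful error if the argument is missing or does not match a task in the cluster. One possible way to make that more robust is sketched below using argparse from the standard library. The helper name parse_task_number and the constant NUM_TASKS are assumptions made for this illustration. NUM_TASKS is assumed to equal the number of addresses in the cluster specification.

```python
import argparse

NUM_TASKS = 2  # assumption: one task per address in the ClusterSpec


def parse_task_number(argv, num_tasks=NUM_TASKS):
    """Parse and validate the task index from command-line arguments."""
    parser = argparse.ArgumentParser(description="Start one worker of my_job")
    parser.add_argument(
        "task_number",
        type=int,
        choices=range(num_tasks),
        help="index of the task this worker should serve",
    )
    return parser.parse_args(argv).task_number


# In the worker script you would pass sys.argv[1:] instead of a literal list.
task_number = parse_task_number(["1"])
print("Worker #{}".format(task_number))  # prints: Worker #1
```

With this in place, starting a worker with an index outside the cluster (for example 5) makes argparse print a usage message and exit, instead of Tensorflow failing later on.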

For a more in-depth look at distributed computing with Tensorflow, please refer to the documentation.

Saving variables (model)

Having to throw out the hard-learned parameters after they have been computed isn't much fun.

Luckily, saving a model in Tensorflow is quite simple using the saver object, as illustrated in the example below:

a = tf.Variable(5)
b = tf.Variable(4, name="my_variable")

# set the value of a to 3
op = tf.assign(a, 3) 

# create saver object
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    sess.run(op)

    print("a:", sess.run(a))
    print("my_variable:", sess.run(b))

    # use saver object to save variables
    # within the context of the current session 
    saver.save(sess, "/tmp/my_model.ckpt")

a: 3
my_variable: 4

Loading variables (model)

As with saving the model, loading a model from a file is also simple.

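Note: Variables are matched by their Tensorflow name, not their Python name. If you specified a Tensorflow name when saving, you must use that same name in your loader. If you didn't, Tensorflow assigns a default name (Variable, Variable_1, and so on, in order of creation), so the loader must create its unnamed variables in the same order as the saving code did.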

# Only necessary if you use IDLE or a Jupyter notebook
tf.reset_default_graph()

# make dummy variables
# the values are arbitrary, here just zero,
# but the shapes must be the same as in the saved model
a = tf.Variable(0)
c = tf.Variable(0, name="my_variable")

saver = tf.train.Saver()

with tf.Session() as sess:

    # use saver object to load variables from the saved model
    saver.restore(sess, "/tmp/my_model.ckpt")

    print("a:", sess.run(a))
    print("my_variable:", sess.run(c))

INFO:tensorflow:Restoring parameters from /tmp/my_model.ckpt
a: 3
my_variable: 4
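
Notice that the loader calls its second variable c in Python, while the saving code called it b. The restore still works because both carry the Tensorflow name my_variable. A rough mental model of this, written in plain Python rather than Tensorflow, is sketched below. It is only an illustration of the matching rule, not how the saver is actually implemented, and the restore function is a hypothetical helper.

```python
# Conceptual model only, not the Saver's real implementation.
# A checkpoint is treated as a mapping from Tensorflow variable names to
# values. "Variable" is the default name Tensorflow gives the first
# unnamed tf.Variable in a graph; "my_variable" was set with name=.
checkpoint = {"Variable": 3, "my_variable": 4}


def restore(tf_names, checkpoint):
    """Look up each Tensorflow name in the checkpoint (hypothetical helper)."""
    missing = [name for name in tf_names if name not in checkpoint]
    if missing:
        raise KeyError("not found in checkpoint: {}".format(", ".join(missing)))
    return {name: checkpoint[name] for name in tf_names}


# The loader's Python names (a, c) never appear; only Tensorflow names do.
restored = restore(["Variable", "my_variable"], checkpoint)
print(restored["Variable"], restored["my_variable"])  # prints: 3 4

# A variable whose Tensorflow name is not in the checkpoint cannot be restored.
try:
    restore(["other_name"], checkpoint)
except KeyError as err:
    print("restore failed:", err)
```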

Come back tomorrow for part 4 where we will look at doing visualizations with Tensorflow.

Read part 4 here now.

#steemdev #machine-learning #programming #ai
