Introduction to Tensorflow as a Computational Framework Part 2
This is part two of a multi-part series. If you haven't already, you should read part one first, where we discussed the design philosophy of Tensorflow.
This time we will look at doing computations with Tensorflow.
This post originally appeared on kasperfred.com where I write more about machine learning.
Basic computation example
Knowing how variables work, we can now look at how to create more complex interactions.
A graph in Tensorflow consists of interconnected operations (ops). An op is essentially a function: anything that takes some input and produces some output. As we discussed before, the default datatype of Tensorflow is the tensor, so operations can be said to perform tensor manipulations.
Let's look at a very basic example, multiplying two scalars. It can be done like so:
import tensorflow as tf

a = tf.Variable(3)
b = tf.Variable(4)
c = tf.multiply(a, b)
print(c)
Tensor("Mul:0", shape=(), dtype=int32)
print(a)
print(b)
<tf.Variable 'Variable_4:0' shape=() dtype=int32_ref>
<tf.Variable 'Variable_5:0' shape=() dtype=int32_ref>
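Note that when we print the result we get another Tensor, and not the actual result. Also, notice that the variables have the shape (), which is because a scalar is a zero-dimensional tensor. Finally, because we didn't specify names, we get the auto-generated names 'Variable_4:0' and 'Variable_5:0'. The number after the underscore is a counter that keeps auto-generated names unique within the graph, and the trailing :0 is the index of the operation's output, not a graph number.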
To get the actual result, we have to compute the value in the context of a session. This can be done like so:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # this is important
    print(sess.run(c))
You can also use tf.InteractiveSession, which is useful if you're using something like IDLE or a Jupyter notebook. Furthermore, it's also possible to start a session by declaring sess = tf.Session() and then close it with sess.close(). However, I do not recommend this practice, as it's easy to forget to close the session, and using this method as an interactive session may have performance implications, as Tensorflow really likes to eat as many resources as it can get its hands on (it's a bit like Chrome in this regard).
We start by creating a session, which signals to Tensorflow that we want to start doing actual computations. Behind the scenes, Tensorflow does a few things: it chooses a device to perform the computations on (by default the first GPU if one is available to Tensorflow, otherwise the CPU), and it initializes the computational graph. While you can use multiple graphs, it's generally recommended to use just one, because data cannot be sent between two graphs without going through Python (which we established is slow). This holds true even if your graph has multiple disconnected parts.
Next we initialize the variables. Why this cannot happen when the session starts I don't know, but it fills in the values of our variables in the graph so that we can use them in our computation. This is one of those small annoyances you have to remember every time you want to compute something.
It might help to remember that Tensorflow is really lazy, and wants to do as little as possible. As an implication of this, you will have to explicitly tell Tensorflow to initialize the variables.
Tensorflow is lazy
It might be useful to explore this in a bit more detail as it's really important to understand how and why this was chosen in order to use Tensorflow effectively.
Tensorflow likes to defer computation for as long as possible. It does so because Python is slow, so it wants to run the computation outside Python. Normally, we use libraries such as numpy to accomplish this, but transferring data between Python and optimized libraries such as numpy is expensive.
Tensorflow gets around this by first defining a graph using Python without doing any computation. It then sends all the data to the graph outside Python, where it can be run using efficient GPU libraries (CUDA). This way, the time spent on transferring data is kept to a minimum.
As a result of this, Tensorflow only has to compute the part of the graph you actually need. It does this by propagating back through the network when you run an operation to discover all the dependencies the computation relies on, and only computes those. It ignores the rest of the network.
Consider the code below for example:
a = tf.Variable(3)
b = tf.Variable(4)
c = tf.multiply(a, b)
d = tf.add(a, c)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    c_value = sess.run(c)
    d_value = sess.run(d)
    print(c_value, d_value)
Here, we have two primitive values, a, and b, and two composite values, c, and d.
- c relies on a, and b.
- d relies on a, and c.
So what happens when we compute the values of the composites?
If we start with the simplest, c, we see that it relies on the primitive values a and b. When computing c, Tensorflow discovers this by tracing back through the graph (which is not the same as backpropagation through a neural network), gets the values of these primitives, and multiplies them together.
The value of d is computed in a similar fashion. Tensorflow finds that d is an addition operation that relies on the values of a and c, so Tensorflow gets the value of each of them. For a, all is great, and Tensorflow is able to use the primitive value as is. For c, however, Tensorflow discovers that it is itself a composite value, here a multiply operation that relies on a and b. Tensorflow now gets the values of a and b, which it uses to compute the value of c, so it can compute the value of d.
Tensorflow recursively computes the dependencies of an operation to find its computed value.
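The dependency walk described above can be imitated in a few lines of plain Python. This is only a toy model, not how Tensorflow is implemented: the Node class and the run function are hypothetical names made up for this sketch. It records which nodes get visited, so you can see that asking for c or d never touches the unrelated node e.

```python
# Toy model of lazy graph evaluation (not Tensorflow's actual implementation).
# Node and run are hypothetical names used only for this illustration.

class Node:
    def __init__(self, name, func=None, inputs=(), value=None):
        self.name = name        # label used for tracing
        self.func = func        # how to combine the inputs (None for primitives)
        self.inputs = inputs    # nodes this node depends on
        self.value = value      # stored value for primitive nodes


def run(node, trace):
    """Recursively evaluate node, recording every node that gets visited."""
    trace.append(node.name)
    if node.func is None:       # primitive: use the stored value as is
        return node.value
    # composite: evaluate each dependency first, then apply the operation
    args = [run(dep, trace) for dep in node.inputs]
    return node.func(*args)


# Building the graph performs no arithmetic at all.
a = Node("a", value=3)
b = Node("b", value=4)
c = Node("c", func=lambda x, y: x * y, inputs=(a, b))
d = Node("d", func=lambda x, y: x + y, inputs=(a, c))
e = Node("e", func=lambda x: x * 100, inputs=(b,))  # never requested below

trace_c = []
c_value = run(c, trace_c)   # visits c, then a and b
trace_d = []
d_value = run(d, trace_d)   # visits d, a, then c and its inputs a and b

print(c_value, trace_c)
print(d_value, trace_d)
```

Note that the trace for d contains c and its inputs again: nothing from the earlier evaluation of c was kept around.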
However, this also means that values are discarded once computed, and can therefore not be used to speed up future computations. Using the example above, this means that the value of c is recalculated when computing the value of d even though we just computed c and it hasn't changed since then.
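To make the cost of that recomputation visible, here is another toy model in plain Python (again with hypothetical names, and not Tensorflow's implementation). Each call to run starts with an empty cache, mimicking how results are thrown away between runs, and a counter tracks how often the multiplication actually executes. It also sketches one way to avoid the repeated work: request both values in a single call, much like passing a list of fetches such as sess.run([c, d]) in Tensorflow.

```python
# Toy model: results are cached within one run() call and discarded afterwards.
# This is an illustration only; run and the graph layout are hypothetical.

calls = {"multiply": 0, "add": 0}   # counts how often each op actually executes


def multiply(x, y):
    calls["multiply"] += 1
    return x * y


def add(x, y):
    calls["add"] += 1
    return x + y


# name -> (operation, names of inputs); primitives hold plain values
graph = {
    "a": 3,
    "b": 4,
    "c": (multiply, ("a", "b")),
    "d": (add, ("a", "c")),
}


def run(fetches):
    """Evaluate the requested nodes; the cache lives only for this call."""
    cache = {}

    def evaluate(name):
        if name not in cache:
            entry = graph[name]
            if isinstance(entry, tuple):
                func, inputs = entry
                cache[name] = func(*(evaluate(i) for i in inputs))
            else:
                cache[name] = entry
        return cache[name]

    return [evaluate(name) for name in fetches]


# Two separate runs: c is computed twice, once per run.
(c_value,) = run(["c"])
(d_value,) = run(["d"])
separate_multiplies = calls["multiply"]

# One run fetching both: c is computed once and shared.
calls["multiply"] = 0
both = run(["c", "d"])
shared_multiplies = calls["multiply"]

print(c_value, d_value, separate_multiplies)
print(both, shared_multiplies)
```

In this sketch the two separate runs execute the multiplication twice, while the combined run executes it once.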
Below, this concept is explored further. We see that while the result of c is immediately discarded after being computed, you can save the result into a Python variable (here res), and when you do that, you can even access the result after the session is closed.
a = tf.Variable(3)
b = tf.Variable(4)
c = tf.multiply(a, b)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    res = sess.run(c)
    print(res, c)
12 Tensor("Mul:0", shape=(), dtype=int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    res = sess.run(c)

# The session is closed here, but res is still available.
print(res, c)
12 Tensor("Mul:0", shape=(), dtype=int32)
Come back tomorrow when we will look at doing computations at scale (GPUs and multiple computers) as well as how we can save our results.
Read part 3 now.