TensorFlow
A tensor is a generalization of an array or a matrix to arbitrary dimensions.
- 0: scalar
- 1: vector
- 2: matrix
- 3+: higher-rank tensor
- any rank (0 and up): tensor
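A minimal sketch of the rank list above, using nested Python lists instead of TensorFlow. The helper names `shape` and `rank` are made up for illustration and assume rectangular (non-ragged) input.

```python
def shape(value):
    """Return the shape of a rectangular nested list as a tuple (() for a scalar)."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


def rank(value):
    """Rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape(value))


scalar = 3.0                              # rank 0
vector = [1.0, 2.0, 3.0]                  # rank 1
matrix = [[1.0, 2.0], [3.0, 4.0]]         # rank 2
cube = [[[1.0], [2.0]], [[3.0], [4.0]]]   # rank 3

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("cube", cube)]:
    print(name, "rank:", rank(value), "shape:", shape(value))
```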
Placeholder
The common usage for TensorFlow programs is to first create a graph and then launch it in a session.
placeholder -- a value that we'll input when we ask TensorFlow to run a computation.
e.g. a placeholder for a 2-d tensor that can have any number of rows, where each row is a 784-long vector.
x = tf.placeholder(tf.float32, shape=[None, 784])
Variable
Weights and biases for a model with 784 inputs and 10 outputs:
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
- placeholder: user input
- variable: the program updates
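The split between fed inputs and program-updated state can be sketched without TensorFlow. In the plain-Python sketch below, `train_step`, the learning rate, and the one-weight model y = w * x are illustrative assumptions and not part of these notes. The batches play the role of placeholders (supplied by the caller on every step) and `w` plays the role of a variable (updated by the program).

```python
def train_step(w, x_batch, y_batch, learning_rate=0.1):
    """One gradient-descent step on mean squared error for the model y = w * x."""
    n = len(x_batch)
    grad = sum(2.0 * (w * x - y) * x for x, y in zip(x_batch, y_batch)) / n
    return w - learning_rate * grad


# "Placeholder" role: data the caller feeds in on each step.
x_batch = [1.0, 2.0, 3.0]
y_batch = [2.0, 4.0, 6.0]   # generated with the true weight 2.0

# "Variable" role: state the program itself keeps and updates.
w = 0.0
for _ in range(100):
    w = train_step(w, x_batch, y_batch)

print(w)   # ends up very close to the true weight
```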
>>> import tensorflow as tf
>>> a = tf.constant(10)
>>> b = tf.constant(20)
>>> y = tf.mul(a,b)  # renamed tf.multiply in TensorFlow 1.0
>>> y
<tf.Tensor 'Mul:0' shape=() dtype=int32>
>>> with tf.Session() as sess:
...     sess.run(y)
...
200
>>> hello = tf.constant('Hello')
>>> sess = tf.Session()
>>> sess.run(hello)
b'Hello'
>>> import tensorflow as tf
>>> a = tf.constant(2)
>>> b = tf.constant(3)
>>> sess = tf.Session()
>>> sess.run(a + b)
5
>>> a = tf.placeholder(tf.int16)
>>> b = tf.placeholder(tf.int16)
>>> sess.run(tf.add(a,b), feed_dict={a: 2, b: 3})
5
>>> matrix1 = tf.constant([[3., 3.]])
>>> matrix2 = tf.constant([[2.],[2.]])
>>> product = tf.matmul(matrix1, matrix2)
>>> sess.run(product)
array([[ 12.]], dtype=float32)
>>> v1 = tf.constant([1, 2, 3])
>>> v2 = tf.constant([4, 5, 6])
>>> sess.run(v1 * v2)
array([ 4, 10, 18], dtype=int32)
>>> sess.run(tf.log([0., 1., 2., 3.]))
array([ -inf, 0. , 0.69314718, 1.09861231], dtype=float32)
>>> sess.run(tf.reduce_sum([[1, 2, 3], [4, 5, 6]]))
21
>>> sess.run(tf.reduce_sum([[1, 2, 3], [4, 5, 6]], 0))
array([5, 7, 9], dtype=int32)
>>> sess.run(tf.reduce_sum([[1, 2, 3], [4, 5, 6]], 1))
array([ 6, 15], dtype=int32)
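To make the difference between the element-wise product, the matrix product, and the reduce_sum axes concrete, here is a plain-Python re-implementation on nested lists. The functions are simplified stand-ins written for these notes (rank-1 and rank-2 inputs only), not TensorFlow's own code.

```python
def elementwise_mul(v1, v2):
    """Multiply matching entries, like v1 * v2 on two rank-1 tensors."""
    return [a * b for a, b in zip(v1, v2)]


def matmul(m1, m2):
    """Matrix product of an (n x k) and a (k x m) nested list."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*m2)]
            for row in m1]


def reduce_sum(matrix, axis=None):
    """Sum a rank-2 nested list over everything, down columns (0), or across rows (1)."""
    if axis is None:
        return sum(sum(row) for row in matrix)
    if axis == 0:
        return [sum(col) for col in zip(*matrix)]
    if axis == 1:
        return [sum(row) for row in matrix]
    raise ValueError("axis must be None, 0, or 1 for a rank-2 input")


data = [[1, 2, 3], [4, 5, 6]]
print(elementwise_mul([1, 2, 3], [4, 5, 6]))   # [4, 10, 18]
print(matmul([[3.0, 3.0]], [[2.0], [2.0]]))    # [[12.0]]
print(reduce_sum(data))                        # 21
print(reduce_sum(data, 0))                     # [5, 7, 9]
print(reduce_sum(data, 1))                     # [6, 15]
```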
op: operation, e.g. tf.constant() is an op, tf.matmul() is another op.
computation graph: each node is an op. Just a definition, or blueprint.
Session: a graph has to be launched in a session. Sessions should be closed to release resources.
Tensor:
Short definition: an n-dimensional array
Official definition:
TensorFlow programs use a tensor data structure to represent all data -- only tensors are passed between operations in the computation graph. You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape.
A placeholder() operation generates an error if you do not supply a feed for it.
- tensor: an n-dimensional array
A Tensor object is a symbolic handle to the result of an operation, but does not actually hold the values of the operation's output. Instead, TensorFlow encourages users to build up complicated expressions (such as entire neural networks and its gradients) as a dataflow graph. You then offload the computation of the entire dataflow graph (or a subgraph of it) to a TensorFlow Session, which is able to execute the whole computation much more efficiently than executing the operations one-by-one.
- variables: hold and update parameters (e.g. weights); in-memory buffers containing tensors.
- placeholder:
The tf.placeholder() op allows you to define tensors that must be fed, and optionally allows you to constrain their shape as well.
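The "must be fed" rule can be imitated with a toy evaluator. Everything here (`Placeholder`, `Scale`, `run`) is a hypothetical miniature written for these notes. It only mimics the behaviour described above and is not how TensorFlow is implemented.

```python
class Placeholder:
    """A graph input with no value of its own; a value must be fed at run time."""

    def __init__(self, name):
        self.name = name


class Scale:
    """A toy op node that multiplies every element of its input by a factor."""

    def __init__(self, source, factor):
        self.source = source
        self.factor = factor


def run(node, feed_dict=None):
    """Evaluate a node, reading placeholder values from feed_dict."""
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("placeholder %r must be fed" % node.name)
        return feed_dict[node]
    if isinstance(node, Scale):
        return [node.factor * v for v in run(node.source, feed_dict)]
    raise TypeError("unknown node type: %r" % type(node))


x = Placeholder("x")
y = Scale(x, 2)

print(run(y, feed_dict={x: [1, 2, 3]}))   # [2, 4, 6]

try:
    run(y)                                # nothing fed for x
except ValueError as err:
    print("error:", err)
```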
If t is a Tensor object, t.eval() is shorthand for sess.run(t)
A tensor has a static type and dynamic dimensions.
Convert to Tensor
tf.convert_to_tensor(value, dtype=None, name=None, as_ref=False)
It accepts Tensor objects, NumPy arrays, Python lists, and Python scalars.
>>> tf.convert_to_tensor([1.0, 2.0])
<tf.Tensor 'Const:0' shape=(2,) dtype=float32>
>>> import numpy as np
>>> tf.convert_to_tensor(np.array([1.0, 2.0]))
<tf.Tensor 'Const_5:0' shape=(2,) dtype=float64>
>>> tf.convert_to_tensor(np.array([1.0, 2.0]), dtype=np.float32)
<tf.Tensor 'Const_6:0' shape=(2,) dtype=float32>
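The dtype in these reprs decides how each number is stored. As a rough stdlib-only illustration (the `struct` module and the helper `to_float32` are not part of these notes), a float32 takes 4 bytes and rounds 0.1 more coarsely than the 8-byte float64 that Python floats and default NumPy float arrays use.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


print(len(struct.pack("<f", 0.1)), "bytes per float32")
print(len(struct.pack("<d", 0.1)), "bytes per float64")

# 0.1 is not exactly representable; float32 keeps fewer digits of it.
print(repr(0.1))
print(repr(to_float32(0.1)))

# Values that need only a few bits survive the conversion unchanged.
print(to_float32(1.0) == 1.0, to_float32(2.0) == 2.0)
```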
Example
>>> x = tf.constant(35, name='x')
>>> y = tf.Variable(x + 5)
>>> sess.run(tf.initialize_all_variables())  # tf.global_variables_initializer() in TensorFlow 0.12+
>>> sess.run(y)
40
>>> sess.close()
Random
Each of these calls returns a tf.Tensor object:
tf.random_uniform([1], -1.0, 1.0)
tf.random_normal([784, 200], stddev=0.35)
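What the shape, range, and stddev arguments mean can be checked with the standard `random` module. The two helpers below are simplified stand-ins (fixed rank, explicit `rng` argument, both assumptions made for this sketch) and do not reproduce TensorFlow's random number generator.

```python
import random


def random_uniform(shape, minval, maxval, rng):
    """Rank-1 only: shape[0] samples drawn uniformly between minval and maxval."""
    return [rng.uniform(minval, maxval) for _ in range(shape[0])]


def random_normal(shape, stddev, rng, mean=0.0):
    """Rank-2 only: shape[0] rows of shape[1] samples from a normal distribution."""
    rows, cols = shape
    return [[rng.gauss(mean, stddev) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)   # fixed seed so the run is repeatable

u = random_uniform([1], -1.0, 1.0, rng)
w = random_normal([784, 200], stddev=0.35, rng=rng)

# Measure the spread of all 784 * 200 samples.
flat = [v for row in w for v in row]
sample_mean = sum(flat) / len(flat)
sample_std = (sum((v - sample_mean) ** 2 for v in flat) / len(flat)) ** 0.5

print(len(u), "uniform value:", u[0])
print(len(w), "rows x", len(w[0]), "columns")
print("sample stddev:", sample_std)   # close to the requested 0.35
```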
Temp Notes
TensorFlow separates the definition of computations from their execution even further by having them happen in separate places: a graph defines the operations, but the operations only happen within a session.
A graph is like a blueprint, and a session is like a construction site.
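The blueprint/construction-site split can be imitated in a few lines. This is a toy model with made-up `Node` and `Session` classes, not TensorFlow's API: building nodes computes nothing, and running one node evaluates only the nodes it depends on.

```python
class Node:
    """One op in a toy graph: a function plus the nodes that feed it."""

    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = tuple(inputs)


class Session:
    """Executes nodes on request and records which ones actually ran."""

    def __init__(self):
        self.executed = []

    def run(self, node):
        args = [self.run(dep) for dep in node.inputs]
        self.executed.append(node.name)
        return node.fn(*args)


# Definition: build the blueprint. No arithmetic happens here.
a = Node("a", lambda: 2)
b = Node("b", lambda: 3)
total = Node("add", lambda x, y: x + y, [a, b])
unused = Node("mul", lambda x, y: x * y, [a, b])

# Execution: only the nodes that `total` depends on are evaluated.
sess = Session()
print(sess.run(total))   # 5
print(sess.executed)     # ['a', 'b', 'add']
```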
>>> import tensorflow as tf
>>> sess = tf.Session()
>>> hello = tf.constant('Hello, World!')
>>> sess.run(hello)
b'Hello, World!'
>>> hello = tf.constant('Hello')
>>> world = tf.constant('World')
>>> hello
<tf.Tensor 'Const_2:0' shape=() dtype=string>
sess.run() materializes the tensor, i.e. computes its actual value:
>>> sess.run(hello)
b'Hello'
The same works with some calculations:
>>> sess.run(hello + " ")
b'Hello '
>>> sess.run(" ")
Traceback (most recent call last):
...
KeyError: "The name ' ' refers to an Operation not in the graph."
>>> sess.run(hello + " " + world)
b'Hello World'
Matrix Calculation
>>> import numpy as np
>>> matrix1 = 10 * np.random.random_sample((3, 4))
>>> matrix2 = 10 * np.random.random_sample((4, 6))
>>> np.matmul(matrix1, matrix2)
array([[ 159.40614027, 162.54034372, 73.28459269, 142.01526423,
127.91968372, 98.2847073 ],
[ 94.05143578, 94.8883029 , 37.81891208, 79.43049234,
68.99657767, 49.1148775 ],
[ 131.34252857, 98.32689307, 45.45396031, 91.68707474,
55.91695781, 76.97886235]])
>>> sess.run(tf.matmul(tf.constant(matrix1), tf.constant(matrix2)))
array([[ 159.40614027, 162.54034372, 73.28459269, 142.01526423,
127.91968372, 98.2847073 ],
[ 94.05143578, 94.8883029 , 37.81891208, 79.43049234,
68.99657767, 49.1148775 ],
[ 131.34252857, 98.32689307, 45.45396031, 91.68707474,
55.91695781, 76.97886235]])
>>> tf.get_default_graph().get_operations()[-3].node_def
name: "Const_4"
op: "Const"
attr {
  key: "dtype"
  value {
    type: DT_DOUBLE
  }
}
attr {
  key: "value"
  value {
    tensor {
      dtype: DT_DOUBLE
      tensor_shape {
        dim {
          size: 3
        }
        dim {
          size: 4
        }
      }
      tensor_content: "\005\252\204x\024\363\036@U\2118W\323R\026@\226\3306-=r\033@v\022\301\3002\375\022@\r?R\376v\265\022@\333\356a\t\210\016\001@*\255\300\241\360\022\020@\344\320\033\013\214\006\013@\253/\343\266\265x\027@\366\177\331\013F\355\373?\301\324D\326\273Q!@@\273\247k)\265\314?"
    }
  }
}
>>> tf.get_default_graph().get_operations()[-2].node_def
name: "Const_5"
op: "Const"
attr {
  key: "dtype"
  value {
    type: DT_DOUBLE
  }
}
attr {
  key: "value"
  value {
    tensor {
      dtype: DT_DOUBLE
      tensor_shape {
        dim {
          size: 4
        }
        dim {
          size: 6
        }
      }
      tensor_content: "zz\255\266\314\363\035@&\240\'\214\213\315 @\351\333)N\031E\364?\262`c\344\330\035\017@\300\033\025\263)\216\247?\326EJ\253\245J\347?\354#\354\031\204\343\005@\031\317Ue\301\331\025@\312\360j\031\031\346\026@@aGnU\010\034@*\026P\337\270\375#@Zrx\241\226,\037@\371\240.74\331\"@\240W\026\312\223n\021@\030N.xax\t@\306W\312\032\201\206\031@\340\342\351_F\264\020@b\314\264\003\241?\033@\201\374R\224\311$\022@\335\240\241\305\205L\037@\212\374\203\005XS\000@D/w\255>t\030@\326\020/\324&+\"@\236\333\251\362qI\340?"
    }
  }
}
>>> tf.get_default_graph().get_operations()[-1].node_def
name: "MatMul"
op: "MatMul"
input: "Const_4"
input: "Const_5"
attr {
  key: "T"
  value {
    type: DT_DOUBLE
  }
}
attr {
  key: "transpose_a"
  value {
    b: false
  }
}
attr {
  key: "transpose_b"
  value {
    b: false
  }
}
TensorFlow uses protocol buffers internally.
Why separate definition from execution:
To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data.
TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python.
>>> graph = tf.get_default_graph()
>>> x = tf.Variable(1.0)
>>> [op.name for op in graph.get_operations()]
[... 'Variable/initial_value', 'Variable', 'Variable/Assign', 'Variable/read']
>>> sess.run(tf.initialize_all_variables())
>>> y = tf.constant(1.0)
>>> [op.name for op in graph.get_operations()]
[... 'Const']
Graph
The default graph and sess.graph are the same object:
>>> tf.get_default_graph()
<tensorflow.python.framework.ops.Graph object at 0x1047b87b8>
>>> sess.graph
<tensorflow.python.framework.ops.Graph object at 0x1047b87b8>
https://www.oreilly.com/learning/hello-tensorflow
Placeholder
A placeholder is simply a variable that we will assign data to at a later date.
>>> x = tf.placeholder("int32", 3)
>>> y = x * 2
>>> sess.run(y, feed_dict={x:[1,2,3]})
array([2, 4, 6], dtype=int32)
>>> c = tf.constant([1,2,3])
>>> y = c * 2
>>> sess.run(y)
array([2, 4, 6], dtype=int32)
>>> sess.run(x, feed_dict={x:[1,2,3]})
Traceback (most recent call last):
File "/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 658, in _do_call
...
tensorflow.python.pywrap_tensorflow.StatusNotOK: Invalid argument: Placeholder_3:0 is both fed and fetched.
x alone is a placeholder
>>> x
<tf.Tensor 'Placeholder_3:0' shape=(3,) dtype=int32>
these are operations:
>>> x + 1
<tf.Tensor 'add_13:0' shape=(3,) dtype=int32>
>>> x * 1
<tf.Tensor 'mul_4:0' shape=(3,) dtype=int32>
x = tf.placeholder("float", [None, 3])
None in the shape means any number of rows.
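A small sketch of how a shape with None can be read: None matches any size in that position, while every other entry must match exactly. The helper `matches` is hypothetical and only illustrates the rule; TensorFlow performs its own check when a value is fed.

```python
def matches(spec, shape):
    """True if a concrete shape fits a placeholder spec; None in the spec fits any size."""
    if len(spec) != len(shape):
        return False
    return all(want is None or want == got for want, got in zip(spec, shape))


spec = [None, 3]                 # any number of rows, exactly 3 columns

print(matches(spec, (1, 3)))     # True
print(matches(spec, (500, 3)))   # True
print(matches(spec, (2, 4)))     # False: wrong number of columns
print(matches(spec, (3,)))       # False: wrong rank
```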
Keras
An alternative (better?) API layer for TensorFlow.
Troubleshooting
Error:
ImportError: No module named 'keras.layers'; 'keras' is not a package
Solution:
- Make sure keras is correctly installed
- Make sure your script is NOT named keras.py; otherwise Python will look in YOUR script for keras.layers (or other packages)
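One way to check which file an import would resolve to is `importlib.util.find_spec` from the standard library. The helper `describe_import` is a made-up name for this sketch. It is demonstrated on stdlib modules so it runs without keras installed.

```python
import importlib.util


def describe_import(name):
    """Return (origin, is_package) for what `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin, spec.submodule_search_locations is not None


# Shown on stdlib names so the sketch runs anywhere;
# replace "json" with "keras" when debugging the error above.
print(describe_import("json"))     # a package: is_package is True
print(describe_import("string"))   # a plain module: is_package is False
```

If the origin printed for keras points at your own keras.py and is_package is False, the script is shadowing the real package.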