Dimensionality and Broadcasting

When we operate on arrays of different dimensionality, they can combine in different ways, either elementwise or through broadcasting.

Let’s start from scratch and build up to more complex examples. In the example below, we have a TensorFlow constant representing a single number.

import tensorflow as tf

a = tf.constant(3, name='a')

with tf.Session() as session:
    print(session.run(a))

Not much of a surprise there! We can also do computations, such as adding another number to it:

a = tf.constant(3, name='a')
b = tf.constant(4, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Let’s extend this concept to a list of numbers. To start, let’s create a list of three numbers, and then add another list of three numbers to it:

a = tf.constant([1, 2, 3], name='a')
b = tf.constant([4, 5, 6], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

This is known as an elementwise operation: the corresponding elements from each list are considered in turn, added together, and the results combined into a new list.

What happens if we just add a single number to this list?

a = tf.constant([1, 2, 3], name='a')
b = tf.constant(4, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Is this what you expected? This is known as a broadcast operation. Our primary object of reference was a, which is a list of numbers, also called an array or a one-dimensional vector. Adding a single number (called a scalar) results in a broadcast operation, where the scalar is added to each element of the list.
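
As an aside, the + operator on tensors is shorthand for TensorFlow’s tf.add operation, so the same broadcast can be written explicitly. Here is a minimal sketch using the same constants as above:

a = tf.constant([1, 2, 3], name='a')
b = tf.constant(4, name='b')
add_op = tf.add(a, b)  # equivalent to a + b; the scalar b is broadcast to each element of a

with tf.Session() as session:
    print(session.run(add_op))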

Now let’s look at an extension, which is a two-dimensional array, also known as a matrix. This extra dimension can be thought of as a “list of lists”. In other words, a list is a combination of scalars, and a matrix is a list of lists.

That said, how do operations on matrices work?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([[1, 2, 3], [4, 5, 6]], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

That’s elementwise. If we add a scalar, the results are fairly predictable:

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant(100, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Here is where things start getting tricky. What happens if we add a one-dimensional array to a two-dimensional matrix?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101, 102], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

In this case, the array was broadcast to the shape of the matrix, resulting in the array being added to each row of the matrix. Using this terminology, a matrix is a list of rows.

What if we didn’t want this, and instead wanted to add b across the columns of the matrix?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

This didn’t work, as TensorFlow attempted to broadcast across the rows. It couldn’t do this, because the number of values in b (2) was not the same as the number of scalars in each row (3).
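
If you want to see the failure for yourself, note that (in TensorFlow 1.x, at least) the shape mismatch is detected while the graph is being built, before a session ever runs. A minimal sketch:

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101], name='b')

try:
    add_op = a + b  # shapes (2, 3) and (2,) are not broadcast-compatible
except ValueError as e:
    print('Broadcasting failed:', e)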

We can do this operation by creating a new matrix from our list instead.

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([[100], [101]], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

What happened here? To understand this, let’s look at matrix shapes.

a.shape
    TensorShape([Dimension(2), Dimension(3)])
b.shape
    TensorShape([Dimension(2), Dimension(1)])

You can see from these two examples that a has two dimensions, the first of size 2 and the second of size 3. In other words, it has two rows, each with three scalars in it.

Our b constant also has two dimensions: two rows with one scalar in each. This is not the same as a list, nor is it the same as a matrix of one row with two scalars.

Because the shapes match on the first dimension but not the second, the broadcasting happened across columns instead of rows. (TensorFlow follows the same broadcasting rules as NumPy, so the NumPy documentation on broadcasting is a good reference for the details.)
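
Rather than writing out the nested list by hand, you can also turn a one-dimensional constant into a column using tf.expand_dims (or tf.reshape). A minimal sketch that is equivalent to the example above:

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101], name='b')

# Add a trailing axis so that b has shape (2, 1) rather than (2,),
# which broadcasts across the columns of a.
b_col = tf.expand_dims(b, axis=1)
add_op = a + b_col

with tf.Session() as session:
    print(session.run(add_op))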

Exercises

  1. Create a 3-dimensional matrix. What happens if you add a scalar, array or matrix to it?
  2. Use tf.shape (it’s an operation) to get a constant’s shape during execution of the graph (see the sketch after this list).
  3. Think about use cases for higher-dimensional matrices. In other words, where might you need a 4D matrix, or even a 5D matrix? Hint: think about collections rather than single objects.
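
As a starting point for exercises 1 and 2, here is a minimal sketch that builds a 3-dimensional constant, adds a scalar to it, and uses tf.shape to query the shape while the graph runs (the names c, add_op and shape_op are just placeholders):

# A 3-dimensional constant: two 2x3 matrices stacked together.
c = tf.constant([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]], name='c')
add_op = c + 100        # the scalar is broadcast to every element
shape_op = tf.shape(c)  # an operation that returns the shape when the graph runs

with tf.Session() as session:
    print(session.run(add_op))
    print(session.run(shape_op))  # prints [2 2 3]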

Keep going!

We have a growing set of lessons that we hope will guide you through learning this powerful library. Keep going to our next lesson, or use the navigation menu at the top of the page to go directly to a specific lesson.

If you have any feedback, please see our feedback page. If you spot any errors in our lessons, please report them on our GitHub page with the name of the lesson in which the error resides, so that we can resolve and close them there.

If you have larger questions that may involve consultancy, please contact us here.