Dimensionality and Broadcasting

When we operate on arrays of different dimensionality, they can combine in different ways, either elementwise or through broadcasting.

Let’s start from scratch and build up to more complex examples. In the example below, we have a TensorFlow constant representing a single number.

import tensorflow as tf

a = tf.constant(3, name='a')

with tf.Session() as session:
    print(session.run(a))

Not much of a surprise there! We can also do computations, such as adding another number to it:

a = tf.constant(3, name='a')
b = tf.constant(4, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Let’s extend this concept to a list of numbers. To start, let’s create a list of three numbers, and then add another list of three numbers to it:

a = tf.constant([1, 2, 3], name='a')
b = tf.constant([4, 5, 6], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

This is known as an elementwise operation, where the elements from each list are considered in turn, added together and then the results combined.
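To see the idea without TensorFlow, here is a minimal plain-Python sketch of the same elementwise addition. The helper name `elementwise_add` is hypothetical, and the code only illustrates the concept; it is not how TensorFlow implements it internally.

```python
def elementwise_add(xs, ys):
    """Add two equal-length lists position by position."""
    if len(xs) != len(ys):
        raise ValueError("lists must have the same length")
    # Pair up the elements in order, add each pair, and collect the results.
    return [x + y for x, y in zip(xs, ys)]


result = elementwise_add([1, 2, 3], [4, 5, 6])
print(result)  # [5, 7, 9]
```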

What happens if we just add a single number to this list?

a = tf.constant([1, 2, 3], name='a')
b = tf.constant(4, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Is this what you expected? This is known as a broadcast operation. Our primary object of reference was a, which is a list of numbers, also called an array or a one-dimensional vector. Adding a single number (called a scalar) results in a broadcast operation, where the scalar is added to each element of the list.
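As a rough illustration in plain Python, broadcasting a scalar behaves as if the scalar were repeated once per element and then added elementwise. The helper `add_scalar` is a hypothetical name used only for this sketch.

```python
def add_scalar(xs, scalar):
    """Broadcast a single number across a list by adding it to every element."""
    return [x + scalar for x in xs]


broadcast_result = add_scalar([1, 2, 3], 4)
print(broadcast_result)  # [5, 6, 7]

# Equivalent to an elementwise add against a list of repeated 4s.
repeated = [4, 4, 4]
print([x + y for x, y in zip([1, 2, 3], repeated)])  # [5, 6, 7]
```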

Now let’s look at an extension, which is a two-dimensional array, also known as a matrix. This extra dimension can be thought of as a “list of lists”. In other words, a list is a combination of scalars, and a matrix is a list of lists.
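The "list of lists" picture can be made concrete by counting how many items sit at each nesting level. The sketch below does this for plain Python lists. `shape_of` is a hypothetical helper, and it assumes every inner list at a given level has the same length.

```python
def shape_of(value):
    """Return the shape of a regularly nested list as a tuple of sizes."""
    shape = []
    # Walk down the first element at each level, recording each list's length.
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


print(shape_of(3))                       # ()
print(shape_of([1, 2, 3]))               # (3,)
print(shape_of([[1, 2, 3], [4, 5, 6]]))  # (2, 3)
```

A scalar has an empty shape, a list has one entry in its shape, and a matrix has two: the number of rows followed by the number of scalars in each row.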

That said, how do operations on matrices work?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([[1, 2, 3], [4, 5, 6]], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

That’s elementwise. If we add a scalar, the results are fairly predictable:

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant(100, name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

Here is where things start getting tricky. What happens if we add a one-dimensional array to a two-dimensional matrix?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101, 102], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

In this case, the array was broadcast to the shape of the matrix, resulting in the array being added to each row of the matrix. Using this terminology, a matrix is a list of rows.
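Treating the matrix as a list of rows, the same behaviour can be sketched in plain Python. The function name `add_row_to_each_row` and the variable names are assumptions made for this illustration only.

```python
def add_row_to_each_row(matrix, row):
    """Add a one-dimensional list to every row of a list-of-lists matrix."""
    result = []
    for matrix_row in matrix:
        if len(matrix_row) != len(row):
            raise ValueError("row length must match the matrix width")
        # Each matrix row is combined elementwise with the same broadcast row.
        result.append([m + r for m, r in zip(matrix_row, row)])
    return result


a_rows = [[1, 2, 3], [4, 5, 6]]
b_row = [100, 101, 102]
print(add_row_to_each_row(a_rows, b_row))
# [[101, 103, 105], [104, 106, 108]]
```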

What if we didn’t want this, and wanted to add b across the columns of the matrix instead?

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([100, 101], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

This didn’t work, as TensorFlow attempted to broadcast across the rows. It couldn’t do this, because the number of values in b (2) was not the same as the number of scalars in each row (3).
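The compatibility check can be sketched as a small shape calculation. This follows the NumPy-style rule: align shapes from the right, and treat two sizes as compatible when they are equal or one of them is 1. `broadcast_shape` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    # Pad the shorter shape with leading 1s so both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        # Two sizes are compatible if they are equal or one of them is 1.
        if dim_a == dim_b or dim_a == 1 or dim_b == 1:
            result.append(dim_b if dim_a == 1 else dim_a)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)


print(broadcast_shape((2, 3), (3,)))  # (2, 3)

try:
    broadcast_shape((2, 3), (2,))
except ValueError as error:
    print("incompatible:", error)
```

Under this rule, a list of two values lines up against the last dimension of the matrix (size 3), which is why the addition fails.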

We can do this operation by creating a new matrix from our list instead.

a = tf.constant([[1, 2, 3], [4, 5, 6]], name='a')
b = tf.constant([[100], [101]], name='b')
add_op = a + b

with tf.Session() as session:
    print(session.run(add_op))

What happened here? To understand this, let’s look at matrix shapes.

a.shape
    TensorShape([Dimension(2), Dimension(3)])
b.shape
    TensorShape([Dimension(2), Dimension(1)])

You can see from this output that a has two dimensions, the first of size 2 and the second of size 3. In other words, it has two rows, each with three scalars in it.

Our b constant also has two dimensions: two rows with one scalar in each. This is not the same as a list, nor is it the same as a matrix of one row of two scalars.

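Because the shapes match on the first dimension, and b has size 1 on the second, TensorFlow stretched b’s single column across all three columns of a instead of broadcasting across rows. TensorFlow’s broadcasting follows the same rules as NumPy’s, so the NumPy broadcasting documentation covers the details.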
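A plain-Python sketch of this column-wise case is shown below. Each row of b holds a single scalar, and that scalar is reused for every element in the matching row of a. The name `add_column_to_each_column` is hypothetical and exists only for this example.

```python
def add_column_to_each_column(matrix, column):
    """Add a (rows x 1) list-of-lists to every column of a matrix."""
    if len(matrix) != len(column):
        raise ValueError("column must have one entry per matrix row")
    result = []
    for matrix_row, column_row in zip(matrix, column):
        # Each column_row holds exactly one scalar, reused for the whole row.
        (scalar,) = column_row
        result.append([m + scalar for m in matrix_row])
    return result


a_rows = [[1, 2, 3], [4, 5, 6]]
b_column = [[100], [101]]
print(add_column_to_each_column(a_rows, b_column))
# [[101, 102, 103], [105, 106, 107]]
```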

Exercises

  1. Create a 3-dimensional matrix. What happens if you add a scalar, array or matrix to it?
  2. Use tf.shape (it’s an operation) to get a constant’s shape during operation of the graph.
  3. Think about use cases for higher-dimensional matrices. In other words, where might you need a 4D matrix, or even a 5D matrix? Hint: think about collections rather than single objects.