Arrays: Working with Images

In this tutorial, we are going to work with an image, in order to visualise changes to an array. Arrays are powerful structures, as we saw briefly in the previous tutorial. Generating interesting arrays can be difficult, but images provide a great option.

First, download this image (Right Click, and look for an option like “Save Image As…”) to your computer.

This image comes from Wikimedia Commons, by user Uoaei1.

To work with images, we will need matplotlib. We will also need the pillow library, which replaces the deprecated PIL library for working with images. You can install both in your environment using Anaconda’s installation method:

conda install matplotlib pillow

To load the image, we use matplotlib’s image module:

import matplotlib.image as mpimg

# First, load the image
filename = "MarshOrchid.jpg"
image = mpimg.imread(filename)

# Print out its shape
print(image.shape)

The above code reads in the image as a NumPy array, and prints out its shape. Note that filename must be a path to the downloaded image file, either absolute or relative to the directory you run the code from.

You’ll see the output, which is (5528, 3685, 3). This means the image is 5528 pixels high, 3685 pixels wide, and 3 color channels (red, green and blue) “deep”.
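As a minimal sketch of how to read that shape tuple, the snippet below builds a small stand-in array with NumPy and unpacks its dimensions in the same order. The 4 x 6 size and the name fake_image are made up for illustration; this is not the orchid image.

```python
import numpy as np

# A stand-in "image": 4 pixels high, 6 pixels wide, 3 color channels.
# (Hypothetical data, used only so the example runs without the JPEG file.)
fake_image = np.zeros((4, 6, 3), dtype=np.uint8)

# shape is ordered (rows, columns, channels), i.e. (height, width, depth)
height, width, depth = fake_image.shape
print(height, width, depth)

# Indexing follows the same order: row first, then column, then channel.
fake_image[0, 5, 0] = 255          # top-right pixel, first channel
top_right_pixel = fake_image[0, 5]
print(top_right_pixel)
```

The same unpacking works on the array returned by mpimg.imread.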

You can view the current image using pyplot, like so:

import matplotlib.pyplot as plt
plt.imshow(image)
plt.show()

Now that we have our image, let’s use TensorFlow to make some changes to it.

Geometric Manipulations

The first transformation we will perform is a transpose, which swaps the rows and columns of the image. The result looks like the image turned 90 degrees counter-clockwise and then flipped top-to-bottom. The full program is below, most of which you have seen.

import tensorflow as tf
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

# First, load the image again
filename = "MarshOrchid.jpg"
image = mpimg.imread(filename)

# Create a TensorFlow Variable
x = tf.Variable(image, name='x')

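# Op that initializes all variables. Newer TensorFlow 1.x releases deprecate
# this name in favour of tf.global_variables_initializer().
model = tf.initialize_all_variables()

with tf.Session() as session:
    # Swap the height and width axes; the color axis stays in place
    x = tf.transpose(x, perm=[1, 0, 2])
    session.run(model)
    result = session.run(x)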


plt.imshow(result)
plt.show()

The result of the transpose operation:

The new bit is this line:

x = tf.transpose(x, perm=[1, 0, 2])

This line uses TensorFlow’s transpose function, swapping axes 0 and 1 using the perm parameter (axis 2 stays where it is).
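To see what that permutation does without loading the full image, here is a rough NumPy sketch. It uses np.transpose as a stand-in for tf.transpose with the same axis order, and the tiny array named small is hypothetical data.

```python
import numpy as np

# Stand-in array: 2 rows, 3 columns, 3 channels (hypothetical data).
small = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Same permutation as perm=[1, 0, 2]: swap axes 0 and 1, keep axis 2.
swapped = np.transpose(small, (1, 0, 2))

print(small.shape)
print(swapped.shape)

# The pixel at row r, column c moves to row c, column r,
# and its channel values are left untouched.
same_pixel = np.array_equal(swapped[2, 1], small[1, 2])
print(same_pixel)
```

Height and width trade places in the shape, while each pixel keeps its three channel values.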

The next manipulation we will do is a flip (left-right), swapping the pixels from one side to the other. TensorFlow has a function for this called reverse_sequence, but the signature is a bit odd. Here is what the documentation says:

tf.reverse_sequence(input, seq_lengths, seq_dim, batch_dim=None, name=None)

Reverses variable length slices.

This op first slices input along the dimension batch_dim, and for each slice i, reverses the first seq_lengths[i] elements along the dimension seq_dim.

The elements of seq_lengths must obey seq_lengths[i] < input.dims[seq_dim], and seq_lengths must be a vector of length input.dims[batch_dim].

The output slice i along dimension batch_dim is then given by input slice i, with the first seq_lengths[i] slices along dimension seq_dim reversed.

This function is best thought of as:

  1. Iterate through the array according to batch_dim. Setting batch_dim=0 means we go through the rows (top to bottom).
  2. For each item in the iteration:
    • Slice along a second dimension, denoted by seq_dim. Setting seq_dim=1 means we go through the columns (left to right).
    • The length of the slice for the nth item in the iteration is given by the nth item in seq_lengths, and that slice is reversed.
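The steps above can be sketched in plain NumPy. The helper reverse_sequence_sketch below is a hypothetical imitation written for this explanation, fixed to batch_dim=0 and seq_dim=1; it is not TensorFlow's implementation.

```python
import numpy as np

def reverse_sequence_sketch(array, seq_lengths):
    """NumPy imitation of reverse_sequence with batch_dim=0 and seq_dim=1."""
    result = array.copy()
    # Step 1: iterate along batch_dim=0, i.e. row by row (top to bottom).
    for i, length in enumerate(seq_lengths):
        # Step 2: along seq_dim=1, reverse only the first `length` entries.
        result[i, :length] = array[i, :length][::-1]
    return result

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])

# Row 0: reverse the first 2 entries; row 1: reverse all 4.
partly_reversed = reverse_sequence_sketch(grid, [2, 4])
print(partly_reversed)
```

When every entry of seq_lengths equals the row length, each row is reversed in full, which is exactly a left-right flip.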

Let’s see it in action:

import numpy as np
import tensorflow as tf
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

# First, load the image again
filename = "MarshOrchid.jpg"
image = mpimg.imread(filename)
height, width, depth = image.shape

# Create a TensorFlow Variable
x = tf.Variable(image, name='x')

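# Newer TensorFlow 1.x releases call this tf.global_variables_initializer()
model = tf.initialize_all_variables()

with tf.Session() as session:
    # One sequence length per row, each covering the full width
    x = tf.reverse_sequence(x, [width] * height, 1, batch_dim=0)
    session.run(model)
    result = session.run(x)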

print(result.shape)
plt.imshow(result)
plt.show()

The new bit is this line:

x = tf.reverse_sequence(x, [width] * height, 1, batch_dim=0)

It iterates over the image top to bottom (along its height), and slices left to right (along its width). From each row, it then takes a slice of size width, where width is the width of the image, and reverses it. Because the slice covers the whole row, every row is flipped left to right.

The code [width] * height creates a Python list containing the value width once for each row. This is not very efficient! Unfortunately, at time of writing, it doesn’t appear that this function allows you to specify just a single value.
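The sequence lengths can be built in more than one way. As a rough sketch (the 3 x 5 dimensions are made up), the snippet below compares a plain list, an integer NumPy array, and the product np.ones((height,)) * width, which comes out as floats. Since seq_lengths holds lengths, an integer type is presumably the safe choice; whether a float array is accepted may depend on the TensorFlow version.

```python
import numpy as np

height, width = 3, 5  # hypothetical image dimensions

# Plain Python list: one copy of width per row.
as_list = [width] * height

# NumPy array of integers with the same contents.
as_int_array = np.full((height,), width, dtype=np.int64)

# np.ones defaults to float64, so this product holds floats, not integers.
as_float_array = np.ones((height,)) * width

print(as_list)
print(as_int_array.dtype, as_float_array.dtype)
```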

The result of the “fliplr” operation:

Exercises

1) Combine the transposing code with the flip code to rotate the image clockwise.

2) Currently, the flip code (using reverse_sequence) requires width to be precomputed. Look at the documentation for the tf.shape function, and use it to compute the width of the x variable within the session.

3) Perform a “flipud”, which flips the image top-to-bottom.

4) Compute a “mirror”, where the first half of the image is copied, flipped (l-r) and then copied into the second half.
