NumPy: Manipulating arrays
April 24, 2019
In the last two tutorials, we learnt about NumPy arrays, indexing and slicing, and some mathematical operations and functions NumPy supports.
In this tutorial, we will learn how to manipulate NumPy arrays in various ways — including reshaping an array, concatenating multiple arrays, appending elements or rows, etc.
You are encouraged to follow along with the tutorial and play around with NumPy, tinkering with the code and making sure you’re getting the hang of it. Let’s get started!
Set-up
Let’s start by defining some arrays:
import numpy as np
# Create a 1-dimensional array
a = np.array([1,4,2,3,5,7,8,6])
print('array "a"')
print(a)
print()
# Create a 2-dimensional array
b = np.array([[1,0,1,0,2,3], [1,3,0,1,2,0], [0,1,0,0,1,3]])
print('array "b"')
print(b)
print()
np.random.seed(42) # set a seed so that we always get the same random values
c = b + 5
print('array "c"')
print(c)
Reshape and flatten
Reshape can be used to change the shape of an array.
# Reshape to shape 4 x 2
a = a.reshape(4, 2)
print('array "a" reshaped to 4 x 2')
print(a)
print()
# Reshape to shape 2 x 4
a = a.reshape(2, 4)
print('array "a" reshaped to 2 x 4')
print(a)
As you can see above, during reshaping to a 2-D object, values are filled row by row.
Note that the original array and the reshaped array must contain the same number of elements. Otherwise, NumPy raises a ValueError:
a.reshape(3, 5) # raises ValueError: 8 elements cannot fill a 3 x 5 array
“Flatten” is equivalent to reshaping to a vector of length a.size.
a = a.flatten()
print('Flattened array "a"')
print(a)
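To check the claims in this section in one place, here is a small self-contained sketch. It redefines the array a from the set-up; the names grid, flat, same and reshape_failed are only illustrative and do not appear elsewhere in this tutorial.

```python
import numpy as np

a = np.array([1, 4, 2, 3, 5, 7, 8, 6])

# Reshaping fills the new shape row by row
grid = a.reshape(2, 4)
print(grid)

# flatten() gives the same values as reshaping to a vector of length grid.size
flat = grid.flatten()
same = np.array_equal(flat, grid.reshape(grid.size))
print('flatten matches reshape:', same)

# A shape with a different number of elements is rejected
try:
    a.reshape(3, 5)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('reshape failed:', err)
```

Catching the ValueError here is only for demonstration; in your own code you would normally make sure the shapes match before reshaping.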
Resize
If we would like to reshape while changing the total number of elements in the array, we can use np.resize. It returns a new array that repeats the elements of the original from the start (if resizing to a larger size), or drops the trailing elements (if resizing to a smaller size).
print(np.resize(a, (3, 5)))
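The sketch below shows both cases side by side, assuming the same 1-dimensional array a as in the set-up. The names larger and smaller are just for illustration.

```python
import numpy as np

a = np.array([1, 4, 2, 3, 5, 7, 8, 6])

# 15 elements are needed, so the 8 values of "a" are repeated from the start
larger = np.resize(a, (3, 5))
print(larger)

# Only 6 elements are needed, so the last two values of "a" are dropped
smaller = np.resize(a, (2, 3))
print(smaller)

# np.resize returns a new array; "a" itself keeps its shape
print(a.shape)
```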
Broadcast
Broadcasting repeats an existing array along new dimensions to fill a larger shape. np.broadcast_to does this without copying any data: it returns a read-only view of the original array.
The first argument is the array itself, and the second argument is the shape of the new array.
Since the original array is repeated as-is, the trailing dimensions of the new shape must match the dimensions of the original array (or the original dimension must be 1). For example, we can broadcast an array of shape (8,) to an array of shape (6, 8). But we cannot broadcast it to shape (6, 4).
print('Broadcasting array "a" to a new size')
print(np.broadcast_to(a, (6, 8)))
Let’s try broadcasting the 2D array. The original shape of b is (3, 6), and we will broadcast it to the shape (2, 3, 6). This will repeat the array twice.
print(b)
print()
print('Broadcasting array "b" to a new size')
z = np.broadcast_to(b, (2, 3, 6))
print('New shape', z.shape)
print(z)
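If you want to see what np.broadcast_to actually returns, the following sketch may help. It assumes the flattened array a from earlier; the names wide and broadcast_failed are illustrative only.

```python
import numpy as np

a = np.array([1, 4, 2, 3, 5, 7, 8, 6])

# Broadcast shape (8,) to shape (6, 8): every row is the same as "a"
wide = np.broadcast_to(a, (6, 8))
print(wide.shape)

# The result is a read-only view that shares memory with "a"
print('writeable:', wide.flags.writeable)
print('shares memory with a:', np.shares_memory(a, wide))

# The trailing dimension (4) does not match the length of "a" (8)
try:
    np.broadcast_to(a, (6, 4))
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print('broadcast failed:', err)
```

Because the result is read-only, call .copy() on it first if you need an array you can modify.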
Expanding and squeezing dimensions
expand_dims and squeeze can be used to add or remove dimensions from an array.
# Expand the shape of an array by inserting axis
z = np.expand_dims(b, axis=1) # expand at position 1
print('Expanded array "z"')
print('Shape of z', z.shape)
print(z)
print()
# Remove axes of length one from the shape of an array
print('Squeezed array "z"')
print(np.squeeze(z))
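To get a feel for where the new axis ends up, we can compare the shapes for each position. This is a small sketch using the array b from the set-up; the names shapes, z and restored are illustrative.

```python
import numpy as np

b = np.array([[1, 0, 1, 0, 2, 3], [1, 3, 0, 1, 2, 0], [0, 1, 0, 0, 1, 3]])

# Insert a new axis of length one at each possible position
shapes = {axis: np.expand_dims(b, axis=axis).shape for axis in (0, 1, 2)}
for axis, shape in shapes.items():
    print('axis', axis, '->', shape)

# squeeze removes the axis of length one again
z = np.expand_dims(b, axis=1)
restored = np.squeeze(z, axis=1)
print(restored.shape)
print(np.array_equal(restored, b))
```

Passing axis to squeeze is optional; without it, every axis of length one is removed.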
Concatenating and stacking
concatenate and stack can be used to join multiple arrays into one. concatenate joins arrays along an existing axis, so the total number of dimensions stays the same. stack, on the other hand, creates a new dimension.
Let’s start by taking a look at arrays b and c:
print('array "b"')
print(b)
print()
print('array "c"')
print(c)
Now, let’s try concatenating. concatenate joins arrays along an existing axis:
# Join arrays along an existing axis, axis 0
print('Concatenated array "b" and "c", along axis 0')
print(np.concatenate((b, c), axis=0))
print()
# Join arrays along an existing axis, axis 1
print('Concatenated arrays "b" and "c" along axis 1')
print(np.concatenate((b, c), axis=1))
Let’s also try stacking. Unlike concatenation, stacking creates a new dimension:
# Join arrays along a new axis, 0
print('Stack arrays "b" and "c" along a new axis, 0')
z = np.stack((b, c), 0)
print(z.shape)
print(z)
print()
# Join arrays along a new axis, 1
print('Stack arrays "b" and "c" along a new axis, 1')
z = np.stack((b, c), 1)
print(z.shape)
print(z)
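The difference between the two functions is easiest to see in the resulting shapes. Here is a short sketch with the arrays b and c from the set-up; the names cat0, cat1, stack0 and stack1 are illustrative.

```python
import numpy as np

b = np.array([[1, 0, 1, 0, 2, 3], [1, 3, 0, 1, 2, 0], [0, 1, 0, 0, 1, 3]])
c = b + 5

# concatenate keeps 2 dimensions and grows the chosen axis
cat0 = np.concatenate((b, c), axis=0)
cat1 = np.concatenate((b, c), axis=1)
print(cat0.shape, cat1.shape)

# stack adds a new axis of length 2 (one entry per input array)
stack0 = np.stack((b, c), axis=0)
stack1 = np.stack((b, c), axis=1)
print(stack0.shape, stack1.shape)

# The original arrays can be read back by indexing the new axis
print(np.array_equal(stack0[0], b), np.array_equal(stack1[:, 1], c))
```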
Splitting
split, hsplit (horizontal split) and vsplit (vertical split) can be used to split an array into equal-sized subarrays.
Each of these 3 functions returns a list of arrays.
Again, let’s start by taking a look at arrays b and c:
print('array "b"')
print(b)
print()
print('array "c"')
print(c)
By default, split splits along axis 0. For 2D arrays, this means splitting vertically:
print('Split array "b"')
print(np.split(b, 3))
print()
print('Split array "c"')
print(np.split(c, 3))
Now, let’s look at some ways to split horizontally:
print('Split array "c" into 3 parts along axis 1')
print(np.split(c, 3, axis=1))
print()
print('Horizontally split array "b" into 2 parts')
print(np.hsplit(b, 2))
print()
print('Horizontally split array "b" into 3 parts')
print(np.hsplit(b, 3))
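Since the subarrays must be equal-sized, the number of parts has to divide the length of the chosen axis. The sketch below, using the array b from the set-up, shows what happens when it does and when it does not. The names rows, halves and split_failed are illustrative.

```python
import numpy as np

b = np.array([[1, 0, 1, 0, 2, 3], [1, 3, 0, 1, 2, 0], [0, 1, 0, 0, 1, 3]])

# 3 rows split into 3 parts: each part has shape (1, 6)
rows = np.split(b, 3)
print(len(rows), rows[0].shape)

# vsplit is the same as split along axis 0
print(np.array_equal(np.vsplit(b, 3)[1], rows[1]))

# 6 columns split into 2 parts: each part has shape (3, 3)
halves = np.hsplit(b, 2)
print(len(halves), halves[0].shape)

# 6 columns cannot be split into 4 equal parts
try:
    np.split(b, 4, axis=1)
    split_failed = False
except ValueError as err:
    split_failed = True
    print('split failed:', err)
```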
Append, insert and delete
And finally, append, insert and delete can be used to add and remove elements from an array. Unlike Python’s list methods, they do not modify the array in place; each of them returns a new array. append is equivalent to inserting elements at the end of the array.
print('array "a"')
print(a)
print()
# Append values to the end of an array
print('Values appended to array "a"')
print(np.append(a, [1, 2]))
print()
# Insert values (0, 0) along the given axis (0) at the given index (3)
print('Values inserted to array "a"')
print(np.insert(a, 3, [0, 0], axis=0))
print()
# Delete the elements at the given locations (1 and 3) along an axis (0)
print('Values deleted from array "a"')
print(np.delete(a, [1, 3], axis=0))
Let’s see some examples with 2D arrays:
# 2d array "b"
print('array "b"')
print(b)
print()
# Append a row of values to the end of a 2D array
# Notice that the argument itself is a 2D array.
print('Values appended to array "b" along axis 0')
print(np.append(b, [[1, 2, 3, 4, 5, 6]], axis=0))
print()
# Append a column of values to the end of a 2D array
print('Values appended to array "b" along axis 1')
print(np.append(b, [[7], [8], [9]], axis=1))
print()
# Insert values (5, 5, 5) along the given axis (1) at the given index (2)
print('Scalar values inserted to array "b" in column position 2')
print(np.insert(b, 2, 5, axis=1))
print()
# Insert values (1, 2, 3) along the given axis (1) at the given index (1)
print('Column vector inserted to array "b" in column position 1')
print(np.insert(b, [1], [[1],[2],[3]], axis=1))
print()
# Delete the row/column/array at given locations (1) along an axis (0)
print('Row deleted from array "b"')
print(np.delete(b, 1, axis=0))
print()
# Delete the row/column/array at given locations (2) along an axis (1)
print('Column deleted from array "b"')
print(np.delete(b, 2, axis=1))
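One thing worth checking for yourself is that these functions leave the original array untouched, and that append without an axis flattens its inputs. Here is a minimal sketch with the arrays a and b from the set-up; the names appended, inserted, deleted and flat_append are illustrative.

```python
import numpy as np

a = np.array([1, 4, 2, 3, 5, 7, 8, 6])
b = np.array([[1, 0, 1, 0, 2, 3], [1, 3, 0, 1, 2, 0], [0, 1, 0, 0, 1, 3]])

# Each call returns a new array
appended = np.append(a, [1, 2])
inserted = np.insert(a, 3, [0, 0])
deleted = np.delete(a, [1, 3])
print(appended)
print(inserted)
print(deleted)

# "a" is unchanged after all three calls
print(a)

# Without an axis argument, append flattens both inputs first
flat_append = np.append(b, [[7], [8], [9]])
print(flat_append.shape)
```

To keep a result, assign it to a name, for example a = np.append(a, [1, 2]).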
Summary
In this tutorial we learnt how to
- change the shape of a NumPy array using reshape and flatten
- change the size of a NumPy array using resize and broadcast_to
- join arrays using concatenate and stack
- split arrays using split
- append, insert and delete items / subarrays from an array
This tutorial concludes our section on NumPy. In the upcoming project, you’ll use NumPy to do some practical data exploration.