Table of Contents
- Introduction
- The numpy.split Function
- The numpy.array_split Function
- The numpy.hsplit Function
- The numpy.vsplit Function
Introduction
The opposite of concatenation is splitting, which means breaking up an array into multiple sub-arrays. It can be implemented by the functions numpy.split, numpy.array_split, numpy.hsplit and numpy.vsplit.
The numpy.split Function
The numpy.split function splits an array into multiple sub-arrays as views of the original array.
Syntax
The numpy.split function.
numpy.split(ary, indices_or_sections, axis=0)
Parameter | Required? | Default Value | Description |
---|---|---|---|
ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. |
axis | ❌ No | 0 | The axis along which to split. |
Splitting of 1-D Arrays
From the description in the previous section, there are two interpretations of indices_or_sections. For example, if ary is a 1-D array of size 10, we can deduce the following:
indices_or_sections | Description | Example | Output |
---|---|---|---|
sections | An integer N. | 2 | ary[:5], ary[5:] |
indices | A 1-D array of sorted integers. | [2,3] | ary[:2], ary[2:3], ary[3:] |
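The two interpretations in the table can be checked against plain slicing. The following is a minimal sketch that uses a stand-in array built with np.arange (an assumption made for illustration, not the array used in the examples that follow):

```python
import numpy as np

ary = np.arange(10)  # stand-in 1-D array of size 10

# Integer form: N = 2 equal sections
first, second = np.split(ary, 2)
print(first, second)   # [0 1 2 3 4] [5 6 7 8 9]

# Index form: split points at positions 2 and 3
p, q, r = np.split(ary, [2, 3])
print(p, q, r)         # [0 1] [2] [3 4 5 6 7 8 9]
```

Each piece matches the corresponding slice listed in the Output column.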
To get started with our examples, we first create a seeded 1-D random array.
import numpy as np
R = np.random.RandomState(19) # seeded random number generator
a1 = R.choice(10, size=10, replace=False) # array of random integers
print(a1)
[1 7 9 6 8 4 3 0 2 5]
We now split the array into 2 equal halves by specifying sections=2.
Example
Splitting a 1-D array with numpy.split by specifying sections.
np.split(a1, 2)
[array([1, 7, 9, 6, 8]), array([4, 3, 0, 2, 5])]
If we specify sections=3 instead, an error is raised, because an array of 10 elements cannot be split into 3 equal sub-arrays.
np.split(a1, 3)
Traceback (most recent call last):
ValueError: array split does not result in an equal division
Always ensure that the size of the array is divisible by the number of sections specified.
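One way to apply this advice is to test divisibility before calling numpy.split. The sketch below uses a hypothetical helper, split_if_divisible, which is not part of NumPy and is only meant to illustrate the check:

```python
import numpy as np

def split_if_divisible(arr, sections, axis=0):
    """Hypothetical helper: split only when the axis length divides evenly."""
    length = arr.shape[axis]
    if length % sections != 0:
        raise ValueError(
            f"axis {axis} has length {length}, not divisible by {sections}"
        )
    return np.split(arr, sections, axis=axis)

data = np.arange(10)
parts = split_if_divisible(data, 5)   # 10 % 5 == 0, so this succeeds
print([p.tolist() for p in parts])    # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

The helper raises the same exception type as numpy.split, but with a message that names the offending axis length.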
Example
Splitting a 1-D array with numpy.split by specifying an array of indices.
We now split the array into 3 sub-arrays by specifying indices.
x1, x2, x3 = np.split(a1, [2, 6])
print(x1, x2, x3)
[1 7] [9 6 8 4] [3 0 2 5]
$N$ split points lead to $N + 1$ sub-arrays.
In the above example, 2 split points lead to 3 sub-arrays.
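The $N + 1$ rule holds even when a split point lies beyond the end of the array; in that case the extra piece is simply empty. A short sketch with a stand-in array (the names arr, pieces and tail are illustrative):

```python
import numpy as np

arr = np.arange(8)

# Three split points give four sub-arrays
pieces = np.split(arr, [2, 5, 6])
print(len(pieces))                    # 4
print([p.tolist() for p in pieces])   # [[0, 1], [2, 3, 4], [5], [6, 7]]

# A split point past the end still counts: the last piece is empty
tail = np.split(arr, [3, 20])
print([p.tolist() for p in tail])     # [[0, 1, 2], [3, 4, 5, 6, 7], []]
```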
Recall that the resulting sub-arrays are views of the original parent array. Let’s now modify one of the sub-arrays and observe the impact on the parent array.
Example
Modifying a sub-array obtained from a 1-D array using numpy.split.
x2[0] = 99 # write through the view
print(x2)
print(a1)
[99 6 8 4]
[ 1 7 99 6 8 4 3 0 2 5]
We see that the parent array is modified as well.
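When this write-through behavior is not wanted, a sub-array can be copied before it is modified. The sketch below uses np.shares_memory to confirm the view relationship and .copy() to break it; the array names are illustrative:

```python
import numpy as np

parent = np.arange(6)
left, right = np.split(parent, 2)

# The sub-array is a view, so it shares memory with the parent
print(np.shares_memory(left, parent))   # True

# A copy is independent: changing it leaves the parent untouched
independent = left.copy()
independent[0] = -1
print(parent)                           # [0 1 2 3 4 5]

# Writing to the view itself does change the parent
left[0] = 99
print(parent)                           # [99  1  2  3  4  5]
```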
Splitting of 2-D Arrays
We first create a seeded 2-D random array.
a2 = R.choice(3*4, size=(3,4), replace=False)
print(a2)
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
Now, let’s split a2 into 2 equal sub-arrays using sections=2 along axis=1.
Example
Splitting a 2-D array into 2 equal parts along axis=1.
y1, y2 = np.split(a2, 2, axis=1)
print(y1)
print(y2)
[[7 1]
[0 6]
[3 8]]
[[ 9 11]
[10 4]
[ 2 5]]
Let’s split a2 into 2 sub-arrays using indices=[1] along axis=0.
Example
Splitting a 2-D array into two sub-arrays along axis=0.
z1, z2 = np.split(a2, [1], axis=0)
print(z1)
print(z2)
[[ 7 1 9 11]]
[[ 0 6 10 4]
[ 3 8 2 5]]
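The split points need to be specified as a sorted array, even when there is only one. Had we used np.split(a2, 1, axis=0) rather than np.split(a2, [1], axis=0), the integer 1 would have been interpreted as a number of sections, so the result would be a single sub-array containing all of a2 and the unpacking into z1 and z2 would fail.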
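The difference between the integer 1 and the list [1] can be seen by counting the returned pieces. A small sketch with a stand-in 3×4 array named grid (illustrative, not the a2 above):

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# Integer 1: one section, i.e. the whole array in a one-element list
as_sections = np.split(grid, 1, axis=0)
print(len(as_sections))                  # 1

# List [1]: one split point after the first row, giving two pieces
as_indices = np.split(grid, [1], axis=0)
print(len(as_indices))                   # 2
print([p.shape for p in as_indices])     # [(1, 4), (2, 4)]
```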
Splitting of 3-D Arrays
Splitting of 3-D arrays works in a similar manner. We first create a 3-D array by concatenating two 2-D random arrays.
Example
Create a 3-D array by concatenating two 2-D random arrays.
a21 = R.choice(3*4, size=(3,4), replace=False) # 2-D random array
print(a21)
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]
To combine them, we invoke the numpy.stack() function, which joins the arrays along a new axis.
a3 = np.stack((a2, a21), axis=0) # stack along a new first axis
print(a3)
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
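It may help to see why numpy.stack rather than numpy.concatenate yields a 3-D result here. The sketch below compares the output shapes of the two functions on two stand-in 3×4 arrays (the names first and second are illustrative):

```python
import numpy as np

first = np.zeros((3, 4), dtype=int)
second = np.ones((3, 4), dtype=int)

# stack adds a new axis, so two (3, 4) arrays become one (2, 3, 4) array
stacked = np.stack((first, second), axis=0)
print(stacked.shape)   # (2, 3, 4)

# concatenate joins along an existing axis, so the result stays 2-D
joined = np.concatenate((first, second), axis=0)
print(joined.shape)    # (6, 4)
```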
Now, let’s split a3 into 2 equal sub-arrays using sections=2 along axis=2.
Example
Splitting a 3-D array into 2 equal parts along axis=2.
d1, d2 = np.split(a3, 2, axis=2)
print("First sub-array:\n", d1, "\n")
print("Second sub-array:\n", d2)
First sub-array:
[[[ 7 1]
[ 0 6]
[ 3 8]]
[[ 8 7]
[ 3 11]
[ 2 6]]]
Second sub-array:
[[[ 9 11]
[10 4]
[ 2 5]]
[[ 9 5]
[ 0 4]
[10 1]]]
For the above example, two 3-D sub-arrays are produced, each with shape (2,3,2). We can verify this using the shape attribute.
print(a3.shape) # original 3-D array
print(d1.shape) # sub-array 1
print(d2.shape) # sub-array 2
(2, 3, 4)
(2, 3, 2)
(2, 3, 2)
From the above, we can easily see that the 3-D array has been split along axis=2 (third dimension).
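The same shape check can be repeated for every axis to see which dimension shrinks in each case. A minimal sketch on a stand-in array of shape (2, 3, 4); the name cube and the dictionary shapes are illustrative:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

shapes = {}
# Pair each axis with a section count that divides its length evenly
for axis, sections in [(0, 2), (1, 3), (2, 2)]:
    parts = np.split(cube, sections, axis=axis)
    shapes[axis] = [p.shape for p in parts]
    print(axis, shapes[axis])
# 0 [(1, 3, 4), (1, 3, 4)]
# 1 [(2, 1, 4), (2, 1, 4), (2, 1, 4)]
# 2 [(2, 3, 2), (2, 3, 2)]
```

Only the dimension named by axis changes; the other two are preserved.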
The numpy.array_split Function
A variation of the numpy.split function is the numpy.array_split function, which also splits an array into multiple sub-arrays as views of the original array.
numpy.array_split is identical to numpy.split except that numpy.array_split allows indices_or_sections to be an integer that does not equally divide the axis.
In other words, the size of the dimension along which splitting occurs does not need to be divisible by the indices_or_sections parameter. For an array of length l that is to be split into N sections, it returns l%N sub-arrays of size l//N + 1 and the rest of size l//N, where % and // are Python's modulo and floor-division operators.
The following example illustrates the difference.
Example
Splitting a 1-D array using numpy.split with sections=3.
np.split(a1, 3)
Traceback (most recent call last):
ValueError: array split does not result in an equal division
The above illustrates that numpy.split cannot handle an N value that results in unequal divisions.
Example
Splitting a 1-D array using numpy.array_split with sections=3.
np.array_split(a1, 3)
[array([1, 7, 9, 6]), array([8, 4, 3]), array([0, 2, 5])]
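The sizes in this output follow the rule stated earlier: with l = 10 and N = 3, there is 10 % 3 = 1 sub-array of size 10 // 3 + 1 = 4, and the remaining two have size 3. A short sketch that checks the rule on a stand-in array (the variable names are illustrative):

```python
import numpy as np

l, N = 10, 3
sizes = [len(part) for part in np.array_split(np.arange(l), N)]
print(sizes)               # [4, 3, 3]

# l % N sub-arrays of size l // N + 1, then the rest of size l // N
expected = [l // N + 1] * (l % N) + [l // N] * (N - l % N)
print(sizes == expected)   # True
```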
The numpy.hsplit Function
The numpy.hsplit function splits an array into multiple sub-arrays horizontally (column-wise).
Syntax
The numpy.hsplit function.
numpy.hsplit(ary, indices_or_sections)
Parameter | Required? | Default Value | Description |
---|---|---|---|
ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along the second axis (the first axis for 1-D arrays). If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along that axis the array is split. |
numpy.hsplit is equivalent to numpy.split with axis=1 (second axis). The array is always split along the second axis except for 1-D arrays, where it is split along axis=0.
The advantage of using numpy.hsplit over numpy.split is that we do not need to specify the axis parameter.
Example
Splitting a 1-D array using numpy.hsplit with sections=2.
Let’s reuse our 1-D array a1 and perform an hsplit into two equal sub-arrays.
print(a1)
b1, b2 = np.hsplit(a1, 2)
print(b1, b2)
[1 7 9 6 8] [4 3 0 2 5]
Note that the above can also be achieved using np.split(a1, 2).
Let’s now look at how numpy.hsplit works on a 2-D array. We shall reuse our 2-D array a2 and perform an hsplit into three sub-arrays using an array of indices.
Example
Splitting a 2-D array with numpy.hsplit by specifying indices=[1,3].
print(a2)
c1, c2, c3 = np.hsplit(a2, [1,3])
print(c1, c2, c3, sep='\n')
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[7]
[0]
[3]]
[[ 1 9]
[ 6 10]
[ 8 2]]
[[11]
[ 4]
[ 5]]
Note that the above is equivalent to using np.split(a2, [1, 3], axis=1) as demonstrated below.
Example
Splitting a 2-D array using numpy.split with axis=1.
c1, c2, c3 = np.split(a2, [1, 3], axis=1)
print(c1, c2, c3, sep='\n')
[[7]
[0]
[3]]
[[ 1 9]
[ 6 10]
[ 8 2]]
[[11]
[ 4]
[ 5]]
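The equivalence between numpy.hsplit and numpy.split can also be checked programmatically for 1-D, 2-D and 3-D inputs. The sketch below uses stand-in arrays and a hypothetical comparison helper, same_parts, which is not part of NumPy:

```python
import numpy as np

def same_parts(parts_a, parts_b):
    """Hypothetical helper: True when two split results match piece by piece."""
    return len(parts_a) == len(parts_b) and all(
        np.array_equal(a, b) for a, b in zip(parts_a, parts_b)
    )

vec = np.arange(6)                       # 1-D
mat = np.arange(12).reshape(3, 4)        # 2-D
cube = np.arange(24).reshape(2, 3, 4)    # 3-D

# 1-D arrays are split along axis=0; everything else along axis=1
match_1d = same_parts(np.hsplit(vec, 2), np.split(vec, 2, axis=0))
match_2d = same_parts(np.hsplit(mat, [1, 3]), np.split(mat, [1, 3], axis=1))
match_3d = same_parts(np.hsplit(cube, 3), np.split(cube, 3, axis=1))
print(match_1d, match_2d, match_3d)      # True True True
```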
The numpy.vsplit Function
The numpy.vsplit function splits an array into multiple sub-arrays vertically (row-wise). The function only works on arrays of 2 or more dimensions.
Syntax
The numpy.vsplit function.
numpy.vsplit(ary, indices_or_sections)
Parameter | Required? | Default Value | Description |
---|---|---|---|
ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along the first axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along the first axis the array is split. |
numpy.vsplit is equivalent to numpy.split with axis=0 (first axis). The array is always split along the first axis regardless of the array dimension.
Once again, the advantage of using numpy.vsplit over numpy.split is that we do not need to specify the axis parameter, which is always taken to be the first axis.
Example
Splitting a 2-D array using numpy.vsplit with indices=[1,2].
We consider our 2-D array a2 and perform a vsplit into three sub-arrays.
print(a2)
d1, d2, d3 = np.vsplit(a2, [1,2])
print(d1, d2, d3, sep='\n')
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 7 1 9 11]]
[[ 0 6 10 4]]
[[3 8 2 5]]
Note that the above can also be achieved using np.split(a2, [1,2], axis=0).
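The restriction to arrays of 2 or more dimensions can be observed by passing a 1-D array to numpy.vsplit, which raises a ValueError. A small sketch with a stand-in vector, followed by one possible workaround (reshaping to a single column first):

```python
import numpy as np

vec = np.arange(6)

# vsplit rejects 1-D input
raised = False
try:
    np.vsplit(vec, 2)
except ValueError as err:
    raised = True
    print("vsplit failed:", err)

# Reshaping to a (6, 1) column makes the input 2-D, so vsplit works
column = vec.reshape(-1, 1)
top, bottom = np.vsplit(column, 2)
print(top.shape, bottom.shape)   # (3, 1) (3, 1)
```

For a genuinely 1-D array, np.split(vec, 2) is the more direct choice.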
Example
Splitting a 3-D array using numpy.vsplit with indices=[1].
We consider the 3-D array a3 and perform a vsplit into two sub-arrays.
print(a3) # 3-D array
e1, e2 = np.vsplit(a3, [1]) # split along axis=0
print("First sub-array:\n", e1, "\n")
print("Second sub-array:\n", e2)
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
First sub-array:
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]]
Second sub-array:
[[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
- The above can also be achieved using np.split(a3, [1], axis=0).
- Recall that the 3-D array a3 was previously built from two 2-D arrays using np.stack. We now use numpy.vsplit to split the 3-D array back into its two constituent blocks. Each block is still 3-D, with shape (1, 3, 4), and holds one of the original 2-D arrays.
numpy.vsplit does not always mean splitting vertically (row-wise). Instead, the split is always along axis=0 (first dimension). For the case of a 3-D array, this means along the outermost (frame) axis.
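Since splitting is the opposite of concatenation, the pieces returned by numpy.vsplit can be joined again to recover the stacked array, and each piece can be reduced to its 2-D contents. A minimal round-trip sketch with stand-in arrays (the names top, bottom and stacked are illustrative):

```python
import numpy as np

top = np.arange(12).reshape(3, 4)
bottom = top + 100
stacked = np.stack((top, bottom), axis=0)      # shape (2, 3, 4)

# Split along axis=0: each piece keeps a leading axis of length 1
e1, e2 = np.vsplit(stacked, [1])
print(e1.shape, e2.shape)                      # (1, 3, 4) (1, 3, 4)

# Dropping that axis gives back the original 2-D arrays
print(np.array_equal(e1.squeeze(axis=0), top))      # True
print(np.array_equal(e2.squeeze(axis=0), bottom))   # True

# Concatenating the pieces along axis=0 undoes the split
restored = np.concatenate((e1, e2), axis=0)
print(np.array_equal(restored, stacked))       # True
```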