Table of Contents
- Introduction
- The numpy.split Function
- The numpy.array_split Function
- The numpy.hsplit Function
- The numpy.vsplit Function
Introduction
The opposite of concatenation is splitting: breaking an array up into multiple sub-arrays. NumPy provides the functions numpy.split, numpy.array_split, numpy.hsplit and numpy.vsplit for this purpose.
The numpy.split Function
The numpy.split function splits an array into multiple sub-arrays as views of the original array.
Syntax
The numpy.split function.
numpy.split(ary, indices_or_sections, axis=0)
| Parameter | Required? | Default Value | Description |
|---|---|---|---|
| ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
| indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. |
| axis | ❌ No | 0 | The axis along which to split. |
Splitting of 1-D Arrays
As described in the previous section, indices_or_sections can be interpreted in two ways. For example, if ary is a 1-D array of size 10, the two interpretations give the following:
| indices_or_sections | Description | Example | Output |
|---|---|---|---|
| sections | An integer N. | 2 | ary[:5], ary[5:] |
| indices | A 1-D array of sorted integers. | [2,3] | ary[:2], ary[2:3], ary[3:] |
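The two interpretations in the table can be checked directly against plain slicing. The following is a minimal sketch; the array `ary` built with `np.arange(10)` is a stand-in chosen for illustration, not the random array used in the examples that follow.

```python
import numpy as np

ary = np.arange(10)  # stand-in 1-D array of size 10: [0 1 ... 9]

# Integer argument: number of equal sections.
halves = np.split(ary, 2)
# List argument: positions at which to cut.
pieces = np.split(ary, [2, 3])

# Each result matches the corresponding slices from the table.
for got, want in zip(halves, [ary[:5], ary[5:]]):
    print(got, np.array_equal(got, want))
for got, want in zip(pieces, [ary[:2], ary[2:3], ary[3:]]):
    print(got, np.array_equal(got, want))
```

Every comparison should print True, since each sub-array is the same range of elements that the slice expression selects.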
To get started with our examples, we first create a seeded 1-D random array.
import numpy as np
R = np.random.RandomState(19) # set a random seed
a1 = R.choice(10, size=10, replace=False) # array of random integers
print(a1)
[1 7 9 6 8 4 3 0 2 5]
We now split the array into 2 equal halves by specifying sections=2.
Example
Splitting a 1-D array with numpy.split by specifying sections.
np.split(a1, 2)
[array([1, 7, 9, 6, 8]), array([4, 3, 0, 2, 5])]
If we specify sections=3 instead, an error is raised, because an array of size 10 cannot be split into 3 equal sub-arrays.
np.split(a1, 3)
Traceback (most recent call last):
ValueError: array split does not result in an equal division
Always ensure that the size of the array is divisible by the number of sections specified.
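One way to follow this advice is to test divisibility before calling numpy.split, or to catch the ValueError it raises. The sketch below shows both options; the helper name `split_if_divisible` is hypothetical and introduced only for illustration.

```python
import numpy as np

def split_if_divisible(arr, sections, axis=0):
    """Hypothetical helper: split only when the axis length divides evenly."""
    length = arr.shape[axis]
    if length % sections != 0:
        return None  # caller decides what to do instead
    return np.split(arr, sections, axis=axis)

data = np.arange(10)
print(split_if_divisible(data, 5))  # five sub-arrays of size 2
print(split_if_divisible(data, 3))  # None, since 10 % 3 != 0

# Alternatively, catch the error that np.split raises.
try:
    np.split(data, 3)
except ValueError as exc:
    print("split failed:", exc)
```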
Example
Splitting a 1-D array with numpy.split by specifying an array of indices.
We now split the array into 3 sub-arrays by specifying indices.
x1, x2, x3 = np.split(a1, [2, 6])
print(x1, x2, x3)
[1 7] [9 6 8 4] [3 0 2 5]
$N$ split points lead to $N + 1$ sub-arrays.
In the above example, 2 split points lead to 3 sub-arrays.
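The counting rule can be confirmed for several index lists. The sketch below uses an illustrative array `arr` (an assumption, not part of the examples above) and also shows one edge case: a split point beyond the end of the axis does not raise an error but produces an empty sub-array.

```python
import numpy as np

arr = np.arange(8)

for indices in ([3], [2, 5], [1, 4, 6]):
    parts = np.split(arr, indices)
    print(len(indices), "split points ->", len(parts), "sub-arrays")

# A split point at or beyond the end of the axis yields an empty sub-array
# rather than an error, so the count is still N + 1.
parts = np.split(arr, [5, 20])
print([p.size for p in parts])  # [5, 3, 0]
```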
Recall that the resulting sub-arrays are views of the original parent array. Let’s now modify one of the sub-arrays and observe the impact on the parent array.
Example
Modifying a sub-array obtained from a 1-D array using numpy.split.
x2[0] = 99
print(x2)
print(a1)
[99 6 8 4]
[ 1 7 99 6 8 4 3 0 2 5]
We see that the parent array is modified as well.
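When this coupling is not wanted, taking an explicit copy of a sub-array is one way to decouple it from the parent. The following sketch uses illustrative names (`parent`, `left`, `right`) and `np.shares_memory` to make the view relationship visible.

```python
import numpy as np

parent = np.arange(10)
left, right = np.split(parent, [4])

# The sub-arrays share memory with the parent array.
print(np.shares_memory(parent, left))

# An explicit copy is independent, so writing to it leaves the parent unchanged.
right_copy = right.copy()
right_copy[0] = -1
print(parent[4])  # still 4

# Writing through the view changes the parent.
left[0] = 99
print(parent[0])  # now 99
```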
Splitting of 2-D Arrays
We first create a seeded 2-D random array.
a2 = R.choice(3*4, size=(3,4), replace=False)
print(a2)
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
Now, let’s split a2 into 2 equal sub-arrays using sections=2 along axis=1.
Example
Splitting a 2-D array into 2 equal parts along axis=1.
y1, y2 = np.split(a2, 2, axis=1)
print(y1)
print(y2)
[[7 1]
[0 6]
[3 8]]
[[ 9 11]
[10 4]
[ 2 5]]
Let’s split a2 into 2 sub-arrays using indices=[1] along axis=0.
Example
Splitting a 2-D array into two sub-arrays along axis=0.
z1, z2 = np.split(a2, [1], axis=0)
print(z1)
print(z2)
[[ 7 1 9 11]]
[[ 0 6 10 4]
[ 3 8 2 5]]
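The split points must be passed as a sorted 1-D sequence, even when there is only one of them. Had we used np.split(a2, 1, axis=0) rather than np.split(a2, [1], axis=0), the integer 1 would be interpreted as a number of sections. The result would then be a list holding a single sub-array (the whole of a2), and unpacking it into z1, z2 would fail.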
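The difference between passing an integer and passing a one-element list can be seen from the number and shapes of the returned sub-arrays. This is a small sketch on a stand-in array `grid`, which is an assumption made for illustration.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # stand-in 3x4 array

as_sections = np.split(grid, 1, axis=0)   # integer: one section
as_indices = np.split(grid, [1], axis=0)  # list: cut before row 1

print(len(as_sections), as_sections[0].shape)          # 1 (3, 4)
print(len(as_indices), [p.shape for p in as_indices])  # 2 [(1, 4), (2, 4)]
```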
Splitting of 3-D Arrays
Splitting of 3-D arrays works in a similar manner. We first create a 3-D array by concatenating two 2-D random arrays.
Example
Create a 3-D array by concatenating two 2-D random arrays.
a21 = R.choice(3*4, size=(3,4), replace=False) # 2-D random array
print(a21)
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]
To join the two arrays, we invoke the numpy.stack function, which stacks them along a new first axis.
a3 = np.stack((a2, a21), axis=0) # stack along a new first axis
print(a3)
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
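The result is 3-D because numpy.stack joins its inputs along a new axis, whereas numpy.concatenate joins them along an existing one. The sketch below contrasts the two on placeholder arrays `p` and `q` (illustrative names, not from the examples above).

```python
import numpy as np

p = np.zeros((3, 4), dtype=int)
q = np.ones((3, 4), dtype=int)

stacked = np.stack((p, q), axis=0)       # adds a new leading axis
joined = np.concatenate((p, q), axis=0)  # extends the existing first axis

print(stacked.shape)  # (2, 3, 4)
print(joined.shape)   # (6, 4)
```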
Now, let’s split a3 into 2 equal sub-arrays using sections=2 along axis=2.
Example
Splitting a 3-D array into 2 equal parts along axis=2.
d1, d2 = np.split(a3, 2, axis=2)
print("First sub-array:\n", d1, "\n")
print("Second sub-array:\n", d2)
First sub-array:
[[[ 7 1]
[ 0 6]
[ 3 8]]
[[ 8 7]
[ 3 11]
[ 2 6]]]
Second sub-array:
[[[ 9 11]
[10 4]
[ 2 5]]
[[ 9 5]
[ 0 4]
[10 1]]]
For the above example, two 3-D sub-arrays are produced, each with shape (2,3,2). We can verify this using the shape attribute.
print(a3.shape) # original 3-D array
print(d1.shape) # sub-array 1
print(d2.shape) # sub-array 2
(2, 3, 4)
(2, 3, 2)
(2, 3, 2)
From the above, we can easily see that the 3-D array has been split along axis=2 (third dimension).
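The same shape check can be repeated for every axis to see which dimension shrinks in each case. This is a sketch on a stand-in array `cube` with the same shape (2, 3, 4); the section counts are chosen so that each divides the corresponding axis length.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

# (axis, number of equal sections); each count divides that axis length.
for axis, sections in [(0, 2), (1, 3), (2, 2)]:
    parts = np.split(cube, sections, axis=axis)
    print(f"axis={axis}:", [p.shape for p in parts])
```

Only the dimension along the chosen axis changes; the other two dimensions are preserved in every sub-array.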
The numpy.array_split Function
A variation of numpy.split is the numpy.array_split function, which also splits an array into multiple sub-arrays as views of the original array.
numpy.array_split is identical to numpy.split except that numpy.array_split allows indices_or_sections to be an integer that does not equally divide the axis.
In other words, the size of the dimension along which the split occurs does not need to be divisible by indices_or_sections. For an array of length l that is to be split into N sections, it returns l % N sub-arrays of size l//N + 1 and the rest of size l//N. Here % is Python's modulo operator and // is floor division.
The following example illustrates the difference.
Example
Splitting a 1-D array using numpy.split with sections=3.
np.split(a1, 3)
Traceback (most recent call last):
ValueError: array split does not result in an equal division
The above illustrates that numpy.split cannot handle an N value that results in unequal divisions.
Example
Splitting a 1-D array using numpy.array_split with sections=3.
np.array_split(a1, 3)
[array([1, 7, 9, 6]), array([8, 4, 3]), array([0, 2, 5])]
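The size rule stated earlier (l % N sub-arrays of size l//N + 1, the rest of size l//N) can be compared against what numpy.array_split returns. The helper `expected_sizes` below is hypothetical and written only to express that rule; the array `values` is a stand-in of length 10.

```python
import numpy as np

def expected_sizes(length, n_sections):
    """Sizes predicted by the rule quoted above (hypothetical helper)."""
    big = length % n_sections    # number of larger sub-arrays
    base = length // n_sections  # size of the smaller sub-arrays
    return [base + 1] * big + [base] * (n_sections - big)

values = np.arange(10)
for n in (3, 4, 6):
    actual = [part.size for part in np.array_split(values, n)]
    print(n, actual, actual == expected_sizes(len(values), n))
```

For N=3 this gives sizes 4, 3, 3, which matches the output of the example above.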
The numpy.hsplit Function
The numpy.hsplit function splits an array into multiple sub-arrays horizontally (column-wise).
Syntax
The numpy.hsplit function.
numpy.hsplit(ary, indices_or_sections)
| Parameter | Required? | Default Value | Description |
|---|---|---|---|
| ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
| indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. |
numpy.hsplit is equivalent to numpy.split with axis=1 (second axis). The array is always split along the second axis except for 1-D arrays, where it is split along axis=0.
The advantage of using numpy.hsplit over numpy.split is that we do not need to specify the axis parameter.
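The axis that numpy.hsplit uses for each dimensionality can be checked by comparing its output with numpy.split called with an explicit axis. The sketch below does this for 1-D, 2-D and 3-D stand-in arrays; the helper `same` is hypothetical and only compares two lists of arrays.

```python
import numpy as np

vec = np.arange(6)                     # 1-D
mat = np.arange(12).reshape(3, 4)      # 2-D
cube = np.arange(24).reshape(2, 3, 4)  # 3-D

def same(parts_a, parts_b):
    """Hypothetical helper: compare two lists of arrays element-wise."""
    return len(parts_a) == len(parts_b) and all(
        np.array_equal(a, b) for a, b in zip(parts_a, parts_b)
    )

print(same(np.hsplit(vec, 2), np.split(vec, 2, axis=0)))    # 1-D: axis 0
print(same(np.hsplit(mat, 2), np.split(mat, 2, axis=1)))    # 2-D: axis 1
print(same(np.hsplit(cube, 3), np.split(cube, 3, axis=1)))  # 3-D: axis 1
```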
Example
Splitting a 1-D array using numpy.hsplit with sections=2.
Let’s reuse our 1-D array a1 and perform an hsplit into two equal sub-arrays.
b1, b2 = np.hsplit(a1, 2)
print(b1, b2)
[1 7 9 6 8] [4 3 0 2 5]
Note that the above can also be achieved using np.split(a1, 2).
Let’s now look at how numpy.hsplit works on a 2-D array. We shall reuse our 2-D array a2 and perform a hsplit into three sub-arrays using an array of indices.
Example
Splitting a 2-D array with numpy.hsplit by specifying indices=[1,3].
print(a2)
c1, c2, c3 = np.hsplit(a2, [1,3])
print(c1, c2, c3, sep='\n')
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[7]
[0]
[3]]
[[ 1 9]
[ 6 10]
[ 8 2]]
[[11]
[ 4]
[ 5]]
Note that the above is equivalent to using np.split(a2, [1, 3], axis=1) as demonstrated below.
Example
Splitting a 2-D array using numpy.split with axis=1.
c1, c2, c3 = np.split(a2, [1, 3], axis=1)
print(c1, c2, c3, sep='\n')
[[7]
[0]
[3]]
[[ 1 9]
[ 6 10]
[ 8 2]]
[[11]
[ 4]
[ 5]]
The numpy.vsplit Function
The numpy.vsplit function splits an array into multiple sub-arrays vertically (row-wise). The function only works on arrays of 2 or more dimensions.
Syntax
The numpy.vsplit function.
numpy.vsplit(ary, indices_or_sections)
| Parameter | Required? | Default Value | Description |
|---|---|---|---|
| ary | ✔️ Yes | NA | Array to be divided into sub-arrays. |
| indices_or_sections | ✔️ Yes | NA | If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. |
numpy.vsplit is equivalent to numpy.split with axis=0 (first axis). The array is always split along the first axis regardless of the array dimension.
Once again, the advantage of using numpy.vsplit over numpy.split is that we do not need to specify the axis parameter which is always taken to be the first axis.
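Both points, the equivalence with axis=0 and the requirement of at least two dimensions, can be illustrated briefly. The following sketch uses stand-in arrays `mat` and `vec`; the flag `rejected` is introduced only to record whether the 1-D call raised an error.

```python
import numpy as np

mat = np.arange(12).reshape(4, 3)
top, bottom = np.vsplit(mat, 2)
ref_top, ref_bottom = np.split(mat, 2, axis=0)
print(np.array_equal(top, ref_top), np.array_equal(bottom, ref_bottom))

# A 1-D array has no second dimension, so vsplit rejects it.
vec = np.arange(6)
try:
    np.vsplit(vec, 2)
    rejected = False
except ValueError as exc:
    rejected = True
    print("vsplit failed:", exc)
```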
Example
Splitting a 2-D array using numpy.vsplit with indices=[1,2].
We consider our 2-D array a2 and perform a vsplit into three sub-arrays.
print(a2)
d1, d2, d3 = np.vsplit(a2, [1,2])
print(d1, d2, d3, sep='\n')
[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 7 1 9 11]]
[[ 0 6 10 4]]
[[3 8 2 5]]
Note that the above can also be achieved using np.split(a2, [1,2], axis = 0).
Example
Splitting a 3-D array using numpy.vsplit with indices=[1].
We consider the 3-D array a3 and perform a vsplit into two sub-arrays.
print(a3) # 3-D array
e1, e2 = np.vsplit(a3, [1]) # split along axis=0
print("First sub-array:\n", e1, "\n")
print("Second sub-array:\n", e2)
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]
[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
First sub-array:
[[[ 7 1 9 11]
[ 0 6 10 4]
[ 3 8 2 5]]]
Second sub-array:
[[[ 8 7 9 5]
[ 3 11 0 4]
[ 2 6 10 1]]]
- The above can also be achieved using np.split(a3, [1], axis = 0).
- Recall that the 3-D array a3 was previously built from two 2-D arrays using np.stack. Here numpy.vsplit separates it again into two sub-arrays. Each is still 3-D, with shape (1, 3, 4), and holds one of the original 2-D arrays.
numpy.vsplit does not always mean splitting vertically (row-wise). Instead, the split is always along axis=0 (first dimension). For the case of a 3-D array, this means along the frame axis.
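Because the split runs along the first axis, each sub-array keeps a leading dimension of length 1. The sketch below, on stand-in arrays `first` and `second` (assumptions for illustration), shows one way to drop that axis and recover the 2-D arrays that were stacked.

```python
import numpy as np

first = np.arange(12).reshape(3, 4)
second = first + 100
volume = np.stack((first, second), axis=0)  # shape (2, 3, 4)

front, back = np.vsplit(volume, [1])        # each has shape (1, 3, 4)
print(front.shape, back.shape)

# Dropping the length-1 leading axis recovers the original 2-D arrays.
front_2d = np.squeeze(front, axis=0)
back_2d = back[0]
print(np.array_equal(front_2d, first), np.array_equal(back_2d, second))
```

Indexing with `[0]` and calling `np.squeeze(..., axis=0)` are interchangeable here; both return a 2-D array of shape (3, 4).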