Pad arrays effectively using numpy.pad()

Have you ever needed to add a border to an array or a tensor and wondered how to go about it? The answer is numpy.pad().

The NumPy module has a function named pad() that adds padding values around an array. In this article, let us discuss the numpy.pad() function and explore the different ways of using it.

What is Padding an Array?

Padding means adding values around the border of an array. After padding, the array has a different size (shape) but the same number of dimensions.

Have a look at the pictorial representation below.

[Figure: numpy.pad() example]

This is a very simple example in which an array of 1s is padded with 0s along its border. Before padding, the size of the array was 3×3; after padding, it is 5×5. The number of dimensions (or rank) remains the same: 2 in this case.
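The example described above can be reproduced in a few lines. This is a minimal sketch; the variable names arr and padded are our own choice for illustration.

```python
import numpy as np

# 3x3 array of ones, as in the illustration
arr = np.ones((3, 3), dtype=int)

# Add one layer of zeros on every side (constant mode pads with 0 by default)
padded = np.pad(arr, pad_width=1)

print(padded)
print(arr.shape, "->", padded.shape)  # the size changes
print(arr.ndim, "->", padded.ndim)    # the number of dimensions does not
```

The shape grows from (3, 3) to (5, 5), while ndim stays at 2.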

The numpy.pad() function performs the padding on an array. Padding can also be applied to only the top, the bottom, or the sides. And instead of zeros, you can pad with another scalar value or with values derived from the array itself. There are a lot of variations. To use them, let us first understand the syntax of numpy.pad().

Syntax of numpy.pad() along with necessary examples 

numpy.pad(array, pad_width, mode='constant',**kwargs)

This function returns a padded array that has the same number of dimensions as the input array, but a larger size along every axis that was padded.

Parameters: 

array – the input array. It can be an ndarray object or an array-like object (list, tuple, etc.).

pad_width – This parameter specifies the number of values to be padded along each axis. It can be a sequence, an array-like object, or an int. Let us look at the different ways of specifying the pad_width parameter, with examples.

pad_width can be specified in the format pad_width=((before_axis0,after_axis0),(before_axis1,after_axis1),…,(before_axisN,after_axisN))

For example, when the array has two dimensions, pad_width can be specified as follows:

pad_width=((before_axis0,after_axis0),(before_axis1,after_axis1))

 

before_axis0 → number of rows to be padded before the array elements.

after_axis0 → number of rows to be padded after the array elements. 

 

before_axis1 → number of columns to be padded before the array elements.

after_axis1 → number of columns to be padded after the array elements.

 

[Figure: numpy.pad()]

 

Example 1:

import numpy as np

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,((1,2),(1,3)))

Output:

array([[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 1, 1, 1, 1, 0, 0, 0],
      [0, 1, 1, 1, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]])

As the example above shows, 1 row is added before the array elements and 2 rows are added after them. Also, 1 column is added before the array elements and 3 columns are added after them.
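The effect of pad_width on the output shape can be checked directly: each axis grows by its "before" width plus its "after" width. The following is a small sketch of that rule; expected_shape and inner are illustrative names, not part of NumPy.

```python
import numpy as np

arr = np.array([[1, 1, 1, 1], [1, 1, 1, 1]])
pad_width = ((1, 2), (1, 3))

padded = np.pad(arr, pad_width)

# Each axis grows by its "before" width plus its "after" width
expected_shape = tuple(
    size + before + after
    for size, (before, after) in zip(arr.shape, pad_width)
)
print(padded.shape, expected_shape)

# The original values sit inside the padded array, offset by the "before" widths
inner = padded[1:1 + arr.shape[0], 1:1 + arr.shape[1]]
print(np.array_equal(inner, arr))
```

Here a 2×4 array becomes 5×8: 2+1+2 rows and 4+1+3 columns.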

pad_width can also be specified in the format pad_width=(before_axes, after_axes), which applies the same pair of widths to every axis.

For example, when the array has two dimensions:

before_axes → number of rows and columns to be padded before the array elements.

after_axes → number of rows and columns to be padded after the array elements.

 

 

Example 2: 

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,(1,2))

Output:

array([[0, 0, 0, 0, 0, 0, 0],
      [0, 1, 1, 1, 1, 0, 0],
      [0, 1, 1, 1, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0]])

As seen from the above example, 1 row and 1 column are added before the array elements, and 2 rows and 2 columns are added after the array elements.

pad_width can also be specified as a single value, either as pad_width=(pad,) or just pad_width=pad.

For example, when the array has two dimensions and the same number of values must be padded before and after along all the axes, pad_width can be specified as an int value.

 

 

Example 3: 

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,2)

Output:

array([[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 1, 1, 1, 1, 0, 0],
      [0, 0, 1, 1, 1, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]])

As seen from the above example, 2 rows and 2 columns are added before and after the array elements.
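The three ways of writing pad_width shown so far are shorthand for one another. As a quick sketch, the following comparison checks that they produce the same result when every side gets a width of 2; the variable names are ours.

```python
import numpy as np

arr = np.array([[1, 1, 1, 1], [1, 1, 1, 1]])

full = np.pad(arr, ((2, 2), (2, 2)))  # explicit (before, after) pair per axis
pair = np.pad(arr, (2, 2))            # one (before, after) pair for all axes
single = np.pad(arr, (2,))            # one-element sequence
scalar = np.pad(arr, 2)               # plain int

print(all(np.array_equal(full, other) for other in (pair, single, scalar)))
```

The shorter forms are convenient for uniform borders; the full form is needed only when the axes require different widths.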

mode – This is an optional parameter. It can take one of the following values:

“constant” – Pads with a constant value. This is the default mode.

Example 4:

arr=np.array([2,4,6,8])
np.pad(arr,pad_width=2,mode="constant")

Output:

array([0, 0, 2, 4, 6, 8, 0, 0])

As seen in the above example, the array is padded with a constant value, which is 0 unless another value is specified.

“edge” – Pads with the edge values of the array.

 

Example 5: 

arr=np.array([2,4,6,8])
np.pad(arr,pad_width=2,mode="edge")

Output:

array([2, 2, 2, 4, 6, 8, 8, 8])

As seen in the above example, the edge elements of the array, 2 and 8, are used for padding.

 

“linear_ramp” – Pads with a linear ramp between the end value (end_values) and the array edge value.

Example 6: 

arr=np.array([2,4,6,8])
np.pad(arr,pad_width=2,mode="linear_ramp")

Output:

array([0, 1, 2, 4, 6, 8, 4, 0])

Note that when no end value is specified, 0 is used as the end value, so the outermost padded element on each side is 0. The padded elements between the edge element of the array (8) and the outermost padded element (0) change in equal steps: 8-4=4 and 4-4=0.

Similarly, on the other side, the edge element of the array is 2 and the outermost padded element is 0. The elements in between again change in equal steps: 2-1=1 and 1-1=0.
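One way to see the "equal steps" is to rebuild the padding with np.linspace. This is only a sketch for checking the result from outside; it is not a description of how numpy.pad() is implemented internally, and left, right and width are illustrative names.

```python
import numpy as np

arr = np.array([2, 4, 6, 8])
width = 2

padded = np.pad(arr, pad_width=width, mode="linear_ramp")

# Left side: width + 1 evenly spaced points from 0 up to the first element,
# dropping the last point because it is the array element itself
left = np.linspace(0, arr[0], width + 1)[:-1]
# Right side: from the last element down to 0, dropping the first point
right = np.linspace(arr[-1], 0, width + 1)[1:]

print(left, right)
print(padded)
```

The values in left and right match the first two and last two elements of the padded array.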

 

“maximum” – Pads with the maximum value of all or part of the vector along each axis.

Example 7: 

arr=np.array([2,4,6,8])
np.pad(arr,pad_width=2,mode="maximum")

Output:

array([8, 8, 2, 4, 6, 8, 8, 8])

Here, the array is padded with 8 which is the maximum element of the array.

 

Example 8: Let us consider another example to see how this function works with a multi-dimensional array.

arr=np.array([[2,4,6,8],[9,8,7,6]])
np.pad(arr,pad_width=1,mode="maximum")

Output:

array([[9, 9, 8, 7, 8, 9],
      [8, 2, 4, 6, 8, 8],
      [9, 9, 8, 7, 6, 9],
      [9, 9, 8, 7, 8, 9]])

In cases like this, the maximum value is padded along each axis.

Along axis 1 (across the columns) – the maximum value of each row is padded to the left and right of that row. Thus,

[2,4,6,8] → [8,2,4,6,8,8]

[9,8,7,6] → [9,9,8,7,6,9]

Along axis 0 (across the rows) – the maximum value of each column is padded at the top and bottom of that column. Thus, the maximum among 8,9 is padded at the top and bottom of the first column, the maximum among 2,9 is padded at the top and bottom of the second column, and so on.
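To make the per-axis behaviour concrete, the same result can be built by padding one axis at a time. Splitting the call in two is our own decomposition for illustration; the names rows_added and both_added are hypothetical.

```python
import numpy as np

arr = np.array([[2, 4, 6, 8], [9, 8, 7, 6]])

direct = np.pad(arr, pad_width=1, mode="maximum")

# Step 1: pad axis 0 only (one row above and one below, holding each column's maximum)
rows_added = np.pad(arr, ((1, 1), (0, 0)), mode="maximum")
# Step 2: pad axis 1 only (one column left and one right, holding each row's maximum)
both_added = np.pad(rows_added, ((0, 0), (1, 1)), mode="maximum")

print(rows_added)
print(np.array_equal(direct, both_added))
```

For this array, the corner values come out as 9 in either order, because 9 is the largest value overall.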

 

“mean” – Pads with the mean value of all or part of the vector along each axis.

Example 9: 

arr=np.array([2,4,6,8])
np.pad(arr,pad_width=2,mode="mean")

Output:

array([5, 5, 2, 4, 6, 8, 5, 5])

Evidently, 5 is the mean of all the elements of the array ( (2+4+6+8)/4 ) and the array is padded with the mean.

“median” – Pads with the median value of all or part of the vector along each axis. Similar to the above examples, but pads with the median instead.

“minimum” – Pads with the minimum value of all or part of the vector along each axis. Similar to the above examples, but pads with the minimum value instead.
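Since these two modes have no example of their own, here is a short sketch using the same array as the earlier examples. The variable names are ours.

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

# Pads with 2, the smallest element of the array
min_padded = np.pad(arr, pad_width=2, mode="minimum")
# Pads with 5, the median of 2, 4, 6, 8
median_padded = np.pad(arr, pad_width=2, mode="median")

print(min_padded)
print(median_padded)
```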

 

“reflect” – Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.

Example 10: 

arr = [0,2,4,6,8]
np.pad(arr, (2, 2), 'reflect')

Output:

array([4, 2, 0, 2, 4, 6, 8, 6, 4])

Note that the reflection is mirrored on the first and last elements, so those edge elements are not repeated in the padding. Refer to the image below for details.

[Figure: numpy.pad() with reflect]

Example 11: Let us consider another example, where the pad width is 5, and see what happens.

arr=[0,2,4,6,8]
np.pad(arr, (5, 0), 'reflect')

Output:

array([6, 8, 6, 4, 2, 0, 2, 4, 6, 8])

 

Evidently, the reflection runs to the end of the array and then traverses backward. Refer to the image below for details.

[Figure: numpy.pad() with return reflect]

 

“symmetric” – Pads with the reflection of the vector mirrored along the edge of the array.

 

Example 12: 

arr=[0,2,4,6,8]
np.pad(arr, (2, 2), 'symmetric')

Output:

array([2, 0, 0, 2, 4, 6, 8, 8, 6])

Note that when the mode is symmetric, the vector is mirrored along the edge of the array, so the edge elements themselves are repeated in the padding. Refer to the image below for a better understanding.
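The difference between reflect and symmetric can also be expressed with plain slicing. The sketch below rebuilds both paddings by hand and compares them with numpy.pad(); it assumes a 1-D array and a pad width w with 1 <= w < len(arr), and the names reflect_manual and symmetric_manual are illustrative only.

```python
import numpy as np

arr = np.array([0, 2, 4, 6, 8])
w = 2  # pad width on each side; this sketch assumes 1 <= w < len(arr)

# reflect: mirror around the edge element, which is not repeated
reflect_manual = np.concatenate([arr[w:0:-1], arr, arr[-2:-w - 2:-1]])
# symmetric: mirror around the array boundary, so the edge element is repeated
symmetric_manual = np.concatenate([arr[w - 1::-1], arr, arr[:-w - 1:-1]])

print(reflect_manual, np.array_equal(reflect_manual, np.pad(arr, w, mode="reflect")))
print(symmetric_manual, np.array_equal(symmetric_manual, np.pad(arr, w, mode="symmetric")))
```

The only difference between the two slices is whether the first and last elements are included in the mirrored part.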

 

“wrap” – Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.

Example 13:

arr=[0,2,4,6,8,10]
np.pad(arr, (3, 2), 'wrap')

Output:

array([ 6,  8, 10,  0,  2,  4,  6,  8, 10,  0,  2])

Note that 3 elements are padded before the array elements, and these are picked from the end of the array, i.e. 6, 8, 10. 2 elements are padded after the array elements, and these are picked from the beginning of the array, i.e. 0, 2.
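Wrapping behaves like indexing the array with positions taken modulo its length. The following sketch reproduces Example 13 that way for a 1-D array; idx and wrapped_manual are our own names, and this is an equivalent view of the result rather than a claim about NumPy's internals.

```python
import numpy as np

arr = np.array([0, 2, 4, 6, 8, 10])
before, after = 3, 2

# Positions -3 .. len(arr)+1, wrapped back into the valid range 0 .. len(arr)-1
idx = np.arange(-before, len(arr) + after) % len(arr)
wrapped_manual = arr[idx]

print(idx)
print(wrapped_manual)
print(np.array_equal(wrapped_manual, np.pad(arr, (before, after), mode="wrap")))
```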

 

“empty” – Pads with undefined values.
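Because the padded values are undefined in this mode, the only things you can rely on are the output shape and the original data. A minimal sketch, in which the fill value -1 is an arbitrary choice of ours:

```python
import numpy as np

arr = np.array([2, 4, 6, 8])
padded = np.pad(arr, pad_width=2, mode="empty")

print(padded.shape)   # two extra slots on each side
print(padded[2:-2])   # the original values are preserved

# padded[:2] and padded[-2:] hold arbitrary values until you assign them
padded[:2] = -1
padded[-2:] = -1
print(padded)
```

This mode is useful when you intend to fill the border yourself and want to skip the cost of initializing it.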

 

**kwargs – These are optional keyword arguments. **kwargs accepts key-value pairs, and the key that can be used depends on the mode.

 

When the mode is constant, the key to be used is constant_values.

constant_values can be a sequence containing unique pad constants along each axis:

((before_axis0,after_axis0),(before_axis1,after_axis1),…,(before_axisN,after_axisN))

NOTE: The rows (axis 0) are padded first and the columns (axis 1) afterwards, so where the two overlap (the corners), the column constants replace the row constants.

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,((1,2),(1,3)),mode="constant",constant_values=((3,4),(0,2)))

Output:

array([[0, 3, 3, 3, 3, 2, 2, 2],
      [0, 1, 1, 1, 1, 2, 2, 2],
      [0, 1, 1, 1, 1, 2, 2, 2],
      [0, 4, 4, 4, 4, 2, 2, 2],
      [0, 4, 4, 4, 4, 2, 2, 2]])

As seen in the above example, 3 is padded along axis 0 before the array elements and 4 is padded along axis 0 after the array elements. Likewise, 0 and 2 are padded along axis 1 before and after the array elements respectively. Note that in the padded output, 3 is replaced with 0 at the intersection point [0][0] and with 2 at the intersection points [0][5], [0][6] and [0][7].
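The corner behaviour can be reproduced by padding the rows first and the columns second. This two-step split is our own illustration of the order described above; the names rows and both are hypothetical.

```python
import numpy as np

arr = np.array([[1, 1, 1, 1], [1, 1, 1, 1]])

direct = np.pad(arr, ((1, 2), (1, 3)), mode="constant",
                constant_values=((3, 4), (0, 2)))

# Step 1: rows only (axis 0), padded with 3 above and 4 below
rows = np.pad(arr, ((1, 2), (0, 0)), mode="constant",
              constant_values=((3, 4), (0, 0)))
# Step 2: columns only (axis 1), padded with 0 on the left and 2 on the right;
# the new columns span the full height, so the corners take these values
both = np.pad(rows, ((0, 0), (1, 3)), mode="constant",
              constant_values=((0, 0), (0, 2)))

print(np.array_equal(direct, both))
```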

constant_values can also be a sequence containing one pair of pad constants for all the axes, as (before_axes,after_axes).

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,((1,2),(1,3)),mode="constant",constant_values=(3,4))

Output:

array([[3, 3, 3, 3, 3, 4, 4, 4],
      [3, 1, 1, 1, 1, 4, 4, 4],
      [3, 1, 1, 1, 1, 4, 4, 4],
      [3, 4, 4, 4, 4, 4, 4, 4],
      [3, 4, 4, 4, 4, 4, 4, 4]])

As seen in the above example, 3 is padded before the array elements and 4 is padded after the array elements.

constant_values can be a constant value that is padded for all axes. 

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,((1,2),(1,3)),mode="constant",constant_values=10)

Output:

array([[10, 10, 10, 10, 10, 10, 10, 10],
      [10,  1,  1,  1,  1, 10, 10, 10],
      [10,  1,  1,  1,  1, 10, 10, 10],
      [10, 10, 10, 10, 10, 10, 10, 10],
      [10, 10, 10, 10, 10, 10, 10, 10]])

By default, the value is 0.

 

arr=np.array([[1,1,1,1],[1,1,1,1]])
np.pad(arr,((1,2),(1,3)),mode="constant")

Output:

array([[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 1, 1, 1, 1, 0, 0, 0],
      [0, 1, 1, 1, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]])

Note that when no values are passed, by default the array is padded with 0.

When the mode is maximum, minimum, median, or mean, the key to be used is stat_length.

stat_length is the number of values at the edge of each axis that are used to calculate the statistic. It can be a sequence containing a unique statistic length along each axis:

((before_axis0,after_axis0),(before_axis1,after_axis1),…,(before_axisN,after_axisN))

Or it can be a sequence containing one pair of statistic lengths for all the axes, as (before_axes,after_axes). Or it can be a single int that is used for all axes. By default, the value is None, which means the entire axis is used.
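As a sketch of what stat_length changes, compare the minimum mode with and without it on the array from the earlier examples. The names whole and partial are ours.

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

# Without stat_length, the statistic is taken over the whole axis
whole = np.pad(arr, pad_width=2, mode="minimum")
# stat_length=2: the left padding uses only [2, 4], the right padding only [6, 8]
partial = np.pad(arr, pad_width=2, mode="minimum", stat_length=2)

print(whole)    # both sides padded with 2
print(partial)  # left side padded with 2, right side with 6
```

Limiting the statistic to the values near each edge makes the padding follow the local data rather than the array as a whole.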

 

When the mode is linear_ramp, the key to be used is end_values.

end_values can be a sequence containing the ending values of the linear ramp that will form the edge of the padded array:

((before_axis0,after_axis0),(before_axis1,after_axis1),…,(before_axisN,after_axisN))

Or it can be a sequence containing the values for the edge elements along all the axes, as (before_axes,after_axes). Or end_values can be a single constant that is used as the edge element on every side. By default, the value is 0.
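For instance, the ramp from Example 6 can be made to start from 10 on the left while still ending at 0 on the right. This is a small sketch; the end values (10, 0) are arbitrary numbers chosen for illustration.

```python
import numpy as np

arr = np.array([2, 4, 6, 8])

default_ramp = np.pad(arr, pad_width=2, mode="linear_ramp")
# Left ramp runs from 10 down to the edge value 2; right ramp from 8 down to 0
custom_ramp = np.pad(arr, pad_width=2, mode="linear_ramp", end_values=(10, 0))

print(default_ramp)
print(custom_ramp)
```

On the left, the padded values are 10 and 6, stepping by 4 towards the edge value 2.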

 

When the mode is reflect or symmetric, the key to be used is reflect_type.

reflect_type can be either ‘even’ or ‘odd’. The ‘even’ style is the default, with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.

arr=[0,2,4,6,8]
np.pad(arr, (2, 2), 'symmetric',reflect_type='odd')

Output:

array([-2,  0,  0,  2,  4,  6,  8,  8, 10])
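The "two times the edge value minus the reflected value" rule can be checked by hand against the default even padding. The sketch below does that for the example above; left_manual and right_manual are illustrative names.

```python
import numpy as np

arr = np.array([0, 2, 4, 6, 8])
w = 2

even = np.pad(arr, (w, w), mode="symmetric")  # default reflect_type="even"
odd = np.pad(arr, (w, w), mode="symmetric", reflect_type="odd")

# Rebuild the odd padding from the even one: 2 * edge value - reflected value
left_manual = 2 * arr[0] - even[:w]
right_manual = 2 * arr[-1] - even[-w:]

print(odd)
print(left_manual, right_manual)
```

On the right side, for example, the even padding is 8, 6 and the edge value is 8, so the odd padding is 16-8=8 and 16-6=10.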

Conclusion:

In this article, we have discussed the syntax of numpy.pad() and worked through basic examples of using it. We hope this article has been informative. Thanks for reading and Happy Pythoning!


Anusha Pai is a Software Engineer having a long experience in the IT industry and having a passion to write. She has a keen interest in writing Python Errorfixes, Solutions, and Tutorials.
