Shape Calculation Rules for Pooling Operators
Mathematical Formulation
Output shape calculation based on auto_pad and rounding_type:
auto_pad = explicit and rounding_type = floor
    H_out = floor((H + pads_begin[0] + pads_end[0] - ((kernel[0] - 1) * dilations[0] + 1)) / strides[0] + 1)
    W_out = floor((W + pads_begin[1] + pads_end[1] - ((kernel[1] - 1) * dilations[1] + 1)) / strides[1] + 1)
    D_out = floor((D + pads_begin[2] + pads_end[2] - ((kernel[2] - 1) * dilations[2] + 1)) / strides[2] + 1)
auto_pad = explicit and rounding_type = ceil
    H_out = ceil((H + pads_begin[0] + pads_end[0] - ((kernel[0] - 1) * dilations[0] + 1)) / strides[0] + 1)
    W_out = ceil((W + pads_begin[1] + pads_end[1] - ((kernel[1] - 1) * dilations[1] + 1)) / strides[1] + 1)
    D_out = ceil((D + pads_begin[2] + pads_end[2] - ((kernel[2] - 1) * dilations[2] + 1)) / strides[2] + 1)
auto_pad = valid
    Note that AvgPool does not support the dilations attribute; for AvgPool, replace each dilations[i] with 1 in the formulas.
    H_out = ceil((H - ((kernel[0] - 1) * dilations[0] + 1) + 1) / strides[0])
    W_out = ceil((W - ((kernel[1] - 1) * dilations[1] + 1) + 1) / strides[1])
    D_out = ceil((D - ((kernel[2] - 1) * dilations[2] + 1) + 1) / strides[2])
auto_pad = same_upper / same_lower
    H_out = H
    W_out = W
    D_out = D
If H + pads_begin[i] + pads_end[i] - kernel[i] is not evenly divisible by strides[i], the result is rounded according to the rounding_type attribute.
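The explicit and valid rules above can be sketched for a single spatial axis as follows. This is a minimal illustration, not a reference implementation: the helper names `effective_kernel` and `pooled_size` are hypothetical, and integer arithmetic stands in for `floor` and `ceil`.

```python
def effective_kernel(kernel, dilation=1):
    """Extent covered by a (possibly dilated) kernel along one axis."""
    return (kernel - 1) * dilation + 1


def pooled_size(size, kernel, stride, pad_begin=0, pad_end=0, dilation=1,
                auto_pad="explicit", rounding_type="floor"):
    """Output length along one axis (hypothetical helper, explicit/valid only)."""
    k_eff = effective_kernel(kernel, dilation)
    if auto_pad == "valid":
        # ceil((size - k_eff + 1) / stride) written with integer division
        return -(-(size - k_eff + 1) // stride)
    if auto_pad != "explicit":
        raise ValueError("this sketch covers only 'explicit' and 'valid'")
    span = size + pad_begin + pad_end - k_eff
    if rounding_type == "floor":
        return span // stride + 1
    if rounding_type == "ceil":
        return -(-span // stride) + 1
    raise ValueError("this sketch covers only 'floor' and 'ceil'")


print(pooled_size(3, 2, 1, pad_begin=1, pad_end=1))              # 4
print(pooled_size(7, 3, 1, auto_pad="valid"))                    # 5
print(pooled_size(3, 2, 1, pad_begin=1, pad_end=1, dilation=2))  # 3
```

For AvgPool, the same sketch applies with `dilation` left at its default of 1.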
If rounding_type is set to ceil_torch, the last pooling operation within a dimension cannot start in the padding area. If this is the case, the respective dimension is reduced by 1. More context can be found in the PyTorch issue discussion.
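One way to express the ceil_torch adjustment is sketched below for a single axis. The function name `pooled_size_ceil_torch` is hypothetical, and the check shown (the last window must start before the end of the unpadded input plus the leading padding) is an assumption about how "cannot start in the padding area" is evaluated.

```python
def pooled_size_ceil_torch(size, kernel, stride, pad_begin=0, pad_end=0, dilation=1):
    """Output length along one axis for rounding_type = ceil_torch (sketch)."""
    k_eff = (kernel - 1) * dilation + 1
    span = size + pad_begin + pad_end - k_eff
    out = -(-span // stride) + 1  # same as the ceil formula
    # Start of the last window, measured in padded coordinates.
    last_start = (out - 1) * stride
    # Assumed rule: drop the last window if it begins in the trailing padding.
    if last_start >= size + pad_begin:
        out -= 1
    return out


print(pooled_size_ceil_torch(3, 2, 2, pad_begin=1, pad_end=1))  # 2
print(pooled_size_ceil_torch(4, 2, 2, pad_begin=1, pad_end=1))  # 3
```

In the first call, plain ceil rounding would give 3, but the third window would start entirely in the padding, so the size is reduced to 2.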
Examples
Example 1 shows how MaxPool operates with 4D input using 2D kernel and auto_pad = explicit.

    input = [[[[-1, 2, 3],
               [4, 5, -6],
               [-7, 8, 9]]]]   # shape: (1, 1, 3, 3)
    strides = [1, 1]
    pads_begin = [1, 1]
    pads_end = [1, 1]
    kernel = [2, 2]
    rounding_type = "floor"
    auto_pad = "explicit"
    output0 = [[[[-1, 2, 3, 3],
                 [4, 5, 5, 3],
                 [4, 8, 9, 9],
                 [-7, 8, 9, 9]]]]   # shape: (1, 1, 4, 4)
    output1 = [[[[0, 1, 2, 2],
                 [3, 4, 4, 2],
                 [3, 7, 8, 8],
                 [6, 7, 8, 8]]]]   # shape: (1, 1, 4, 4)
Example 2 shows how MaxPool operates with 3D input using 1D kernel and auto_pad = valid.

    input = [[[-1, 2, 3, 5, -7, 9, 1]]]   # shape: (1, 1, 7)
    strides = [1]
    kernel = [3]
    rounding_type = "floor"
    auto_pad = "valid"
    output0 = [[[3, 5, 5, 9, 9]]]   # shape: (1, 1, 5)
    output1 = [[[2, 3, 3, 5, 5]]]   # shape: (1, 1, 5)
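The 1D valid case is small enough to reproduce with a plain sliding window. The sketch below uses a hypothetical helper, `max_pool_1d_valid`, and assumes that ties are resolved in favour of the first maximum in each window.

```python
def max_pool_1d_valid(values, kernel, stride=1):
    """Max pooling over a 1D list without padding; returns (maxima, indices)."""
    out_vals, out_idx = [], []
    for start in range(0, len(values) - kernel + 1, stride):
        window = values[start:start + kernel]
        best = max(window)
        out_vals.append(best)
        # Index of the selected element within the original list.
        out_idx.append(start + window.index(best))
    return out_vals, out_idx


vals, idx = max_pool_1d_valid([-1, 2, 3, 5, -7, 9, 1], kernel=3)
print(vals)  # [3, 5, 5, 9, 9]
print(idx)   # [2, 3, 3, 5, 5]
```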
Example 3 shows how MaxPool operates with 4D input using 2D kernel and auto_pad = same_lower.

    input = [[[[-1, 2, 3],
               [4, 5, -6],
               [-7, 8, 9]]]]   # shape: (1, 1, 3, 3)
    strides = [1, 1]
    kernel = [2, 2]
    rounding_type = "floor"
    auto_pad = "same_lower"
    output0 = [[[[-1, 2, 3],
                 [4, 5, 5],
                 [4, 8, 9]]]]   # shape: (1, 1, 3, 3)
    output1 = [[[[0, 1, 2],
                 [3, 4, 4],
                 [3, 7, 8]]]]   # shape: (1, 1, 3, 3)
Example 4 shows how MaxPool operates with 4D input using 2D kernel and auto_pad = same_upper.

    input = [[[[-1, 2, 3],
               [4, 5, -6],
               [-7, 8, 9]],
              [[2, -1, 5],
               [6, -7, 1],
               [8, 2, -3]]]]   # shape: (1, 2, 3, 3)
    strides = [1, 1]
    kernel = [2, 2]
    rounding_type = "floor"
    auto_pad = "same_upper"
    output0 = [[[[5, 5, 3],
                 [8, 9, 9],
                 [8, 9, 9]],
                [[6, 5, 5],
                 [8, 2, 1],
                 [8, 2, -3]]]]   # shape: (1, 2, 3, 3)
    output1 = [[[[4, 4, 2],
                 [7, 8, 8],
                 [7, 8, 8]],
                [[12, 11, 11],
                 [15, 16, 14],
                 [15, 16, 17]]]]   # shape: (1, 2, 3, 3)
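Examples 3 and 4 differ only in where the implicit padding is placed. The sketch below derives that padding for one axis under the usual "same" convention, which is assumed here rather than taken from the text: same_upper puts any odd extra cell at the end, same_lower at the beginning. The helper `same_pads` is hypothetical, and its handling of strides other than 1 is likewise an assumption.

```python
def same_pads(size, kernel, stride=1, dilation=1, mode="same_upper"):
    """Return (pad_begin, pad_end) for one axis under the assumed convention."""
    k_eff = (kernel - 1) * dilation + 1
    out = -(-size // stride)  # ceil(size / stride); equals size when stride is 1
    total = max((out - 1) * stride + k_eff - size, 0)
    small, large = total // 2, total - total // 2
    if mode == "same_upper":
        return small, large
    if mode == "same_lower":
        return large, small
    raise ValueError("mode must be 'same_upper' or 'same_lower'")


print(same_pads(3, 2, mode="same_lower"))  # (1, 0)
print(same_pads(3, 2, mode="same_upper"))  # (0, 1)
```

With a 2-wide kernel on a 3-wide input, one padding cell is needed per axis. Placing it first reproduces the output of Example 3, and placing it last reproduces Example 4.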
Example 5 shows how MaxPool operates with 4D input using 2D kernel and rounding_type = ceil_torch.

    input = [[[[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]]]   # shape: (1, 1, 3, 3)
    strides = [2, 2]
    kernel = [2, 2]
    pads_begin = [1, 1]
    pads_end = [1, 1]
    rounding_type = "ceil_torch"
    output0 = [[[[1, 3],
                 [7, 9]]]]   # shape: (1, 1, 2, 2)
    output1 = [[[[0, 2],
                 [6, 8]]]]   # shape: (1, 1, 2, 2)
Example 6 shows how MaxPool operates with 4D input using 2D kernel, auto_pad = valid and rounding_type = ceil.

    input = [[[[-1, 2, 3],
               [4, 5, -6],
               [-7, 8, 9]]]]   # shape: (1, 1, 3, 3)
    strides = [2, 2]
    kernel = [2, 2]
    rounding_type = "ceil"
    auto_pad = "valid"
    output0 = [[[[5, 3],
                 [8, 9]]]]   # shape: (1, 1, 2, 2)
    output1 = [[[[4, 2],
                 [7, 8]]]]   # shape: (1, 1, 2, 2)
Example 7 shows how MaxPool operates on 4D input using dilated 2D kernel, auto_pad = explicit and rounding_type = floor.

    input = [[[[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]]]   # shape: (1, 1, 3, 3)
    strides = [1, 1]
    kernel = [2, 2]
    dilations = [2, 2]
    rounding_type = "floor"
    auto_pad = "explicit"
    pads_begin = [1, 1]
    pads_end = [1, 1]
    output0 = [[[[5, 6, 5],
                 [8, 9, 8],
                 [5, 6, 5]]]]   # shape: (1, 1, 3, 3)
    output1 = [[[[4, 5, 4],
                 [7, 8, 7],
                 [4, 5, 4]]]]   # shape: (1, 1, 3, 3)
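To see how strides, pads and dilations interact, the following naive sketch pools a single 2D plane with explicit padding and floor rounding. The function `max_pool_2d` is a hypothetical illustration written for clarity, not speed. It assumes that padding cells can never be selected as the maximum, and that ties go to the first element visited in row-major order.

```python
def max_pool_2d(plane, kernel, strides, pads_begin, pads_end, dilations=(1, 1)):
    """Naive max pooling over one 2D plane; returns (values, indices).

    Each index is the flattened position of the selected element in `plane`.
    """
    h, w = len(plane), len(plane[0])
    k_eff = [(k - 1) * d + 1 for k, d in zip(kernel, dilations)]
    out_h = (h + pads_begin[0] + pads_end[0] - k_eff[0]) // strides[0] + 1
    out_w = (w + pads_begin[1] + pads_end[1] - k_eff[1]) // strides[1] + 1
    values, indices = [], []
    for oy in range(out_h):
        row_vals, row_idx = [], []
        for ox in range(out_w):
            best, best_idx = None, None
            for ky in range(kernel[0]):
                y = oy * strides[0] - pads_begin[0] + ky * dilations[0]
                for kx in range(kernel[1]):
                    x = ox * strides[1] - pads_begin[1] + kx * dilations[1]
                    # Skip positions that fall into the padding area.
                    if not (0 <= y < h and 0 <= x < w):
                        continue
                    if best is None or plane[y][x] > best:
                        best, best_idx = plane[y][x], y * w + x
            row_vals.append(best)
            row_idx.append(best_idx)
        values.append(row_vals)
        indices.append(row_idx)
    return values, indices


plane = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
values, indices = max_pool_2d(plane, kernel=(2, 2), strides=(1, 1),
                              pads_begin=(1, 1), pads_end=(1, 1),
                              dilations=(2, 2))
print(values)   # [[5, 6, 5], [8, 9, 8], [5, 6, 5]]
print(indices)  # [[4, 5, 4], [7, 8, 7], [4, 5, 4]]
```

The call reproduces output0 and output1 of Example 7. With `dilations=(2, 2)` each 2x2 kernel samples every second cell, so it spans a 3x3 region of the padded input.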
Example 8 shows how MaxPool operates on 4D input using 2D kernel, with non-default axis value.

Input shape: (1, 2, 3, 3)
Output shape: (1, 2, 2, 2)

    input = [[[[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]],
              [[10, 11, 12],
               [13, 14, 15],
               [16, 17, 18]]]]   # shape: (1, 2, 3, 3)
    strides = [1, 1]
    kernel = [2, 2]
    dilations = [1, 1]
    rounding_type = "floor"
    auto_pad = "explicit"
    pads_begin = [0, 0]
    pads_end = [0, 0]
    axis = 2
    output0 = [[[[5, 6],
                 [8, 9]],
                [[14, 15],
                 [17, 18]]]]   # shape: (1, 2, 2, 2)
    output1 = [[[[4, 5],
                 [7, 8]],
                [[4, 5],
                 [7, 8]]]]   # shape: (1, 2, 2, 2)
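The effect of axis on the second output can be sketched as an index computation. The helper `flat_index` is hypothetical, and it assumes that axis names the first dimension included when the input is flattened to compute indices, an interpretation consistent with the values shown in Examples 4 and 8.

```python
def flat_index(coords, shape, axis=0):
    """Flattened position of `coords`, counting only dimensions from `axis` on."""
    idx = 0
    for c, s in zip(coords[axis:], shape[axis:]):
        idx = idx * s + c
    return idx


shape = (1, 2, 3, 3)
# Element 14 of Example 8 sits at batch 0, channel 1, row 1, column 1.
print(flat_index((0, 1, 1, 1), shape, axis=2))  # 4
print(flat_index((0, 1, 1, 1), shape, axis=0))  # 13
```

With axis = 2 the channel offset is dropped, which is why both channels in Example 8 report the same indices. With axis = 0 the second channel is offset by 9, as in Example 4.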