Versioned name: ExtractImagePatches-3
Category: Data movement
Short description: The ExtractImagePatches operation collects patches from the input tensor, as if applying a convolution. All extracted patches are stacked in the depth dimension of the output.
Detailed description:
The ExtractImagePatches operation is similar to the TensorFlow* operation ExtractImagePatches.
This op extracts patches of shape sizes which are strides apart in the input image. The output elements are taken from the input at intervals given by the rates argument, as in dilated convolutions.
The result is a 4D tensor containing image patches with size size[0] * size[1] * depth vectorized in the "depth" dimension.
The "auto_pad" attribute has no effect on the size of each patch; it only determines how many patches are extracted.
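The effect of auto_pad on the number of patches can be sketched as follows. This is only an illustration, assuming TensorFlow-style conventions: "valid" keeps just the patches that fit entirely inside the input, while "same_upper" and "same_lower" pad the input so that ceil(in / stride) patches are produced. The helper names effective_size and count_patches are hypothetical and not part of the operation itself; the effective patch extent uses the formula patch_sizes_eff = patch_sizes + (patch_sizes - 1) * (rates - 1) given in this specification.

```python
import math


def effective_size(size: int, rate: int) -> int:
    """Extent covered by a dilated patch along one axis."""
    return size + (size - 1) * (rate - 1)


def count_patches(in_size: int, size: int, stride: int, rate: int, auto_pad: str) -> int:
    """Number of patches along one axis (hypothetical helper, assumed conventions)."""
    eff = effective_size(size, rate)
    if auto_pad == "valid":
        # Assumption: only patches lying fully inside the input are extracted.
        return max(0, math.ceil((in_size - eff + 1) / stride))
    if auto_pad in ("same_upper", "same_lower"):
        # Assumption: the input is zero-padded so every stride step yields a patch.
        return math.ceil(in_size / stride)
    raise ValueError(f"unsupported auto_pad value: {auto_pad}")


print(count_patches(10, 3, 5, 1, "valid"))       # 2
print(count_patches(10, 4, 8, 1, "valid"))       # 1
print(count_patches(10, 4, 9, 1, "same_upper"))  # 2
```

Under these assumptions the patch size never changes with auto_pad; only the returned count does.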
Attributes
sizes: the size [size_rows, size_cols] of the extracted patches.
strides: the distance [stride_rows, stride_cols] between centers of two consecutive patches in an input tensor.
rates: [rate_rows, rate_cols], specifying how far apart two consecutive patch samples are in the input. Equivalent to extracting patches with patch_sizes_eff = patch_sizes + (patch_sizes - 1) * (rates - 1), followed by subsampling them spatially by a factor of rates. This is equivalent to rate in dilated (a.k.a. atrous) convolutions.
Inputs
data: the 4-D tensor of type T with shape [batch, depth, in_rows, in_cols]. Required.
Outputs
A 4-D tensor with shape [batch, size[0] * size[1] * depth, out_rows, out_cols] and the same type as the data tensor. Note that out_rows and out_cols are the numbers of patches extracted along the row and column axes.
Types
Example
Image is a 1 x 1 x 10 x 10 array that contains the numbers 1 through 100. We use the symbol x to mark output patches.
sizes="3,3", strides="5,5", rates="1,1", auto_pad="valid"
x x x 4 5 x x x 9 10
x x x 14 15 x x x 19 20
x x x 24 25 x x x 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
x x x 54 55 x x x 59 60
x x x 64 65 x x x 69 70
x x x 74 75 x x x 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
output:
[[[[ 1 6] [51 56]]
[[ 2 7] [52 57]]
[[ 3 8] [53 58]]
[[11 16] [61 66]]
[[12 17] [62 67]]
[[13 18] [63 68]]
[[21 26] [71 76]]
[[22 27] [72 77]]
[[23 28] [73 78]]]]
output shape: [1, 9, 2, 2]
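The example above can be reproduced with a short NumPy sketch. It is not the operation's implementation, only an illustration for a single-channel image with auto_pad="valid" and rates="1,1"; the variable names are chosen for this sketch, and it relies on numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# The 10 x 10 single-channel image from the example, holding 1..100.
image = np.arange(1, 101).reshape(10, 10)

size, stride = 3, 5  # sizes="3,3", strides="5,5"

# All 3 x 3 windows that fit inside the image: shape (8, 8, 3, 3).
windows = sliding_window_view(image, (size, size))

# Keep every 5th window along each axis: shape (2, 2, 3, 3).
kept = windows[::stride, ::stride]
out_rows, out_cols = kept.shape[:2]

# Flatten each patch and move it to the leading (depth) axis: shape (9, 2, 2).
patches = kept.reshape(out_rows, out_cols, size * size).transpose(2, 0, 1)

print(patches.shape)  # (9, 2, 2)
print(patches[0])     # top-left element of each patch: 1, 6, 51, 56
```

Each of the 9 output channels holds one position inside the 3 x 3 patch, evaluated at all four patch locations.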
sizes="4,4", strides="8,8", rates="1,1", auto_pad="valid"
x x x x 5 6 7 8 9 10
x x x x 15 16 17 18 19 20
x x x x 25 26 27 28 29 30
x x x x 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
output:
[[[[ 1]]
[[ 2]]
[[ 3]]
[[ 4]]
[[11]]
[[12]]
[[13]]
[[14]]
[[21]]
[[22]]
[[23]]
[[24]]
[[31]]
[[32]]
[[33]]
[[34]]]]
output shape: [1, 16, 1, 1]
sizes="4,4", strides="9,9", rates="1,1", auto_pad="same_upper"
x x x x 0 0 0 0 0 x x x x
x x x x 4 5 6 7 8 x x x x
x x x x 14 15 16 17 18 x x x x
x x x x 24 25 26 27 28 x x x x
0 31 32 33 34 35 36 37 38 39 40 0 0
0 41 42 43 44 45 46 47 48 49 50 0 0
0 51 52 53 54 55 56 57 58 59 60 0 0
0 61 62 63 64 65 66 67 68 69 70 0 0
0 71 72 73 74 75 76 77 78 79 80 0 0
x x x x 84 85 86 87 88 x x x x
x x x x 94 95 96 97 98 x x x x
x x x x 0 0 0 0 0 x x x x
x x x x 0 0 0 0 0 x x x x
output:
[[[[ 0 0] [ 0 89]]
[[ 0 0] [ 81 90]]
[[ 0 0] [ 82 0]]
[[ 0 0] [ 83 0]]
[[ 0 9] [ 0 99]]
[[ 1 10] [ 91 100]]
[[ 2 0] [ 92 0]]
[[ 3 0] [ 93 0]]
[[ 0 19] [ 0 0]]
[[ 11 20] [ 0 0]]
[[ 12 0] [ 0 0]]
[[ 13 0] [ 0 0]]
[[ 0 29] [ 0 0]]
[[ 21 30] [ 0 0]]
[[ 22 0] [ 0 0]]
[[ 23 0] [ 0 0]]]]
output shape: [1, 16, 2, 2]
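The zero border in the picture above (one row and column before the image, two after) can be derived as in the following sketch. It assumes the usual "same" padding arithmetic, where the total padding is split in half and any odd remainder goes to the end for same_upper; the helper name same_upper_padding is hypothetical.

```python
import math
import numpy as np


def same_upper_padding(in_size: int, size: int, stride: int, rate: int = 1):
    """Return (pad_begin, pad_end) along one axis (assumed same_upper rule)."""
    eff = size + (size - 1) * (rate - 1)
    out = math.ceil(in_size / stride)
    total = max(0, (out - 1) * stride + eff - in_size)
    begin = total // 2          # the smaller half goes before the data
    return begin, total - begin  # the remainder goes after it


image = np.arange(1, 101).reshape(10, 10)
pads = same_upper_padding(10, size=4, stride=9)  # (1, 2)

# Zero-pad both spatial axes with the same amounts.
padded = np.pad(image, (pads, pads))

print(pads)          # (1, 2)
print(padded.shape)  # (13, 13)
```

With a 13 x 13 padded image, patches of size 4 at stride 9 start at offsets 0 and 9, which matches the marked regions in the picture.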
sizes="3,3", strides="5,5", rates="2,2", auto_pad="valid"
This time we use the symbols x, y, z and k to distinguish the patches:
x 2 x 4 x y 7 y 9 y
11 12 13 14 15 16 17 18 19 20
x 22 x 24 x y 27 y 29 y
31 32 33 34 35 36 37 38 39 40
x 42 x 44 x y 47 y 49 y
z 52 z 54 z k 57 k 59 k
61 62 63 64 65 66 67 68 69 70
z 72 z 74 z k 77 k 79 k
81 82 83 84 85 86 87 88 89 90
z 92 z 94 z k 97 k 99 k
output:
[[[[ 1 6] [ 51 56]]
[[ 3 8] [ 53 58]]
[[ 5 10] [ 55 60]]
[[ 21 26] [ 71 76]]
[[ 23 28] [ 73 78]]
[[ 25 30] [ 75 80]]
[[ 41 46] [ 91 96]]
[[ 43 48] [ 93 98]]
[[ 45 50] [ 95 100]]]]
output shape: [1, 9, 2, 2]
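The sampling pattern with rates="2,2" can be illustrated by indexing the image directly. The sketch below is only an illustration; dilated_patch is a hypothetical helper that picks every rate-th row and column starting from a patch's top-left corner.

```python
import numpy as np

image = np.arange(1, 101).reshape(10, 10)


def dilated_patch(image, top, left, size, rate):
    """Sample a size x size patch whose elements are `rate` apart."""
    rows = top + rate * np.arange(size)
    cols = left + rate * np.arange(size)
    # np.ix_ builds the row/column index grid for the selected samples.
    return image[np.ix_(rows, cols)]


# Patch "x" starts at (0, 0); patch "k" starts at (5, 5) because strides="5,5".
x_patch = dilated_patch(image, 0, 0, size=3, rate=2)
k_patch = dilated_patch(image, 5, 5, size=3, rate=2)

print(x_patch)  # rows 0, 2, 4 and columns 0, 2, 4 of the image
print(k_patch)  # rows 5, 7, 9 and columns 5, 7, 9 of the image
```

Each sampled patch spans 5 rows and 5 columns of the input, which is the effective patch size 3 + (3 - 1) * (2 - 1).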
sizes="2,2", strides="3,3", rates="1,1", auto_pad="valid"
Image is a 1 x 2 x 5 x 5 array that contains two feature maps, where the feature map with coordinate 0 contains the numbers in the range [1, 25] and the feature map with coordinate 1 contains the numbers in the range [26, 50].
x x 3 x x
x x 8 x x
11 12 13 14 15
x x 18 x x
x x 23 x x
x x 28 x x
x x 33 x x
36 37 38 39 40
x x 43 x x
x x 48 x x
output:
[[[[ 1 4] [16 19]]
[[26 29] [41 44]]
[[ 2 5] [17 20]]
[[27 30] [42 45]]
[[ 6 9] [21 24]]
[[31 34] [46 49]]
[[ 7 10] [22 25]]
[[32 35] [47 50]]]]
output shape: [1, 8, 2, 2]
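The ordering of the output channels in this two-channel example can be reproduced with the sketch below. The channel index (r * size + c) * depth + d is inferred from the output listed above and is not stated elsewhere in this specification; the function name extract_patches_valid is hypothetical, and the sketch covers only auto_pad="valid" with rates="1,1" and square patches.

```python
import numpy as np


def extract_patches_valid(data, size, stride):
    """data: [batch, depth, rows, cols] -> [batch, size*size*depth, out_rows, out_cols]."""
    batch, depth, rows, cols = data.shape
    out_rows = (rows - size) // stride + 1
    out_cols = (cols - size) // stride + 1
    out = np.empty((batch, size * size * depth, out_rows, out_cols), dtype=data.dtype)
    for r in range(size):
        for c in range(size):
            for d in range(depth):
                # Assumed ordering: patch position varies slowest, input depth fastest.
                ch = (r * size + c) * depth + d
                row_stop = r + (out_rows - 1) * stride + 1
                col_stop = c + (out_cols - 1) * stride + 1
                out[:, ch] = data[:, d, r:row_stop:stride, c:col_stop:stride]
    return out


# Two 5 x 5 feature maps holding 1..25 and 26..50.
data = np.arange(1, 51).reshape(1, 2, 5, 5)
result = extract_patches_valid(data, size=2, stride=3)

print(result.shape)  # (1, 8, 2, 2)
print(result[0, 1])  # second channel: values 26, 29, 41, 44
```

Consecutive output channels therefore alternate between the two input feature maps before moving to the next position inside the patch.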