smartlab-sequence-modelling-0001
Use Case and High-Level Description
This is an online action segmentation network for 16 classes, trained on an Intel dataset. It is an online version of MSTCN++. The difference between the two is that online MSTCN++ accepts streaming video as input, while MSTCN++ assumes the whole video is given.
For details of the original MSTCN++ model, see the paper.
Specification

| Metric | Value |
|---|---|
| GOPs | 0.048915 |
| MParams | 1.018179 |
| Source framework | PyTorch* |
Accuracy
Notice: In the accuracy report, the feature extraction network is i3d-rgb. You can get this model from @ref omz_models_model_i3d_rgb_tf.
Inputs
The inputs to the network are the feature vectors for each video frame, which should be the output of a feature extraction network such as i3d-rgb-tf or resnet-50-tf, together with the history feature outputs from the previous frame.
You can check the usage of i3d-rgb and smartlab-sequence-modelling-0001 in demos/smartlab_demo.
- Input feature, name: `input`, shape: `1, 2048, 24`, format: `B, W, H`, where:
  - `B` - batch size
  - `W` - feature map width
  - `H` - feature map height
- History feature 1, name: `fhis_in_0`, shape: `12, 64, 2048`, format: `C, H, W`
- History feature 2, name: `fhis_in_1`, shape: `11, 64, 2048`, format: `C, H', W`
- History feature 3, name: `fhis_in_2`, shape: `11, 64, 2048`, format: `C, H', W`
- History feature 4, name: `fhis_in_3`, shape: `11, 64, 2048`, format: `C, H', W`, where:
  - `C` - the channel number of feature vector
  - `H` - feature map height
  - `W` - feature map width
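As a rough illustration of the shapes listed above, the following sketch builds the five input arrays with NumPy and checks the frame features against the documented shape. The `float32` dtype, the zero-filled history for the first frame, and the helper name `make_initial_inputs` are assumptions made for this example; they are not stated in this document.

```python
import numpy as np

# Input names and shapes as listed in this section.
INPUT_SHAPES = {
    "input": (1, 2048, 24),
    "fhis_in_0": (12, 64, 2048),
    "fhis_in_1": (11, 64, 2048),
    "fhis_in_2": (11, 64, 2048),
    "fhis_in_3": (11, 64, 2048),
}


def make_initial_inputs(features):
    """Build the input dictionary for the first frame (hypothetical helper).

    Assumption: history features start as zeros and all tensors are float32.
    """
    features = np.asarray(features, dtype=np.float32)
    if features.shape != INPUT_SHAPES["input"]:
        raise ValueError(
            f"expected shape {INPUT_SHAPES['input']}, got {features.shape}"
        )
    inputs = {"input": features}
    for name, shape in INPUT_SHAPES.items():
        if name != "input":
            inputs[name] = np.zeros(shape, dtype=np.float32)
    return inputs


# Random values stand in for the output of a feature extraction network.
inputs = make_initial_inputs(np.random.rand(1, 2048, 24))
for name, array in inputs.items():
    print(name, array.shape)
```

Keeping the names and shapes in one dictionary makes it easy to catch a mismatched feature extractor output before it reaches the network.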
Outputs
The outputs also consist of two parts: the prediction and four history feature outputs. The prediction contains the action classification and prediction results. The four history feature maps are the model's layer features from past frames.
- Prediction, name: `output`, shape: `4, 1, 64, 24`, format: `C, B, H, W`, where:
  - `C` - the channel number of feature vector
  - `B` - batch size
  - `H` - feature map height
  - `W` - feature map width

  After post-processing with the `argmax()` function, the prediction result can be used to decide the action type of the current frame.
- History feature 1, name: `fhis_out_0`, shape: `12, 64, 2048`, format: `C, H, W`
- History feature 2, name: `fhis_out_1`, shape: `11, 64, 2048`, format: `C, H, W`
- History feature 3, name: `fhis_out_2`, shape: `11, 64, 2048`, format: `C, H, W`
- History feature 4, name: `fhis_out_3`, shape: `11, 64, 2048`, format: `C, H, W`, where:
  - `C` - the channel number of feature vector
  - `H` - feature map height
  - `W` - feature map width
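The hand-off between outputs and inputs can be sketched as a per-frame loop: each `fhis_out_i` from one step is passed as `fhis_in_i` on the next step, and `argmax` is applied to the prediction. The sketch below uses a stand-in `fake_infer` function in place of the real network so that it runs without the model. `fake_infer`, `step`, the `float32` dtype, and the zero-filled initial history are hypothetical. The default `class_axis=2` is also an assumption, since this document does not say which axis of `output` indexes the action classes; see demos/smartlab_demo for the actual post-processing.

```python
import numpy as np

# History shapes as listed for fhis_in_* / fhis_out_*.
HISTORY_SHAPES = [(12, 64, 2048), (11, 64, 2048), (11, 64, 2048), (11, 64, 2048)]


def fake_infer(inputs):
    """Stand-in for the real network (hypothetical).

    Returns random arrays with the documented output names and shapes
    so that the loop below can run without a model.
    """
    rng = np.random.default_rng(0)
    outputs = {"output": rng.random((4, 1, 64, 24), dtype=np.float32)}
    for i, shape in enumerate(HISTORY_SHAPES):
        outputs[f"fhis_out_{i}"] = rng.random(shape, dtype=np.float32)
    return outputs


def step(features, history, infer=fake_infer, class_axis=2):
    """Run one streaming step and return (labels, new_history).

    Assumption: class scores lie along `class_axis` of `output`.
    """
    inputs = {"input": features}
    for i, past in enumerate(history):
        inputs[f"fhis_in_{i}"] = past
    outputs = infer(inputs)
    labels = np.argmax(outputs["output"], axis=class_axis)
    # Outputs of this step become the history inputs of the next step.
    new_history = [outputs[f"fhis_out_{i}"] for i in range(len(history))]
    return labels, new_history


# Assumption: history is zero-filled before the first frame.
history = [np.zeros(shape, dtype=np.float32) for shape in HISTORY_SHAPES]
for _ in range(3):  # three dummy frames
    features = np.zeros((1, 2048, 24), dtype=np.float32)
    labels, history = step(features, history)

print(labels.shape)  # (4, 1, 24)
```

Passing the inference function as a parameter keeps the loop independent of any particular runtime; a real deployment would replace `fake_infer` with a call into the inference engine in use.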
Legal Information
[*] Other names and brands may be claimed as the property of others.