The `i3d-rgb-tf` is a model for video classification, based on the paper "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset". This model uses an RGB input stream and was trained on the Kinetics-400* dataset. Additionally, this model was initialized with values from an Inception v1 model pretrained on the ImageNet* dataset.

Originally redistributed as a checkpoint file, the model was converted to a frozen graph.

Metric | Value |
---|---|
Type | Action recognition |
GFLOPs | 278.981 |
MParams | 12.69 |
Source framework | TensorFlow* |

Accuracy validation was performed on the validation part of the Kinetics-400* dataset. The subset consists of 400 randomly chosen videos from this dataset.

Metric | Converted Model | Converted Model (subset 400) |
---|---|---|
Top 1 | 65.96% | 67.0% |
Top 5 | 86.01% | 88.7% |
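
As an illustration of what the Top 1 and Top 5 rows measure, the following is a minimal sketch of top-k accuracy computed from softmax outputs. The helper name `top_k_accuracy` and the toy data (4 clips, 6 classes) are hypothetical and are not part of the model's evaluation tooling; the real model produces 400 class probabilities per clip.

```python
import numpy as np

def top_k_accuracy(probs, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    # Indices of the k largest probabilities in each row (order among them is irrelevant).
    top_k = np.argpartition(probs, -k, axis=1)[:, -k:]
    # A sample counts as a hit if its label appears anywhere in its top-k set.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy data (hypothetical): 4 clips, 6 classes, one probability row per clip.
probs = np.array([
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.20, 0.40, 0.10, 0.10, 0.10],
    [0.30, 0.25, 0.20, 0.15, 0.06, 0.04],
    [0.05, 0.05, 0.10, 0.10, 0.30, 0.40],
])
labels = np.array([0, 1, 3, 5])

top1 = top_k_accuracy(probs, labels, k=1)
top2 = top_k_accuracy(probs, labels, k=2)
print(f"top-1: {top1:.2%}, top-2: {top2:.2%}")
```

With real model outputs, `probs` would have shape `(N, 400)` and `k` would be 1 or 5.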

Original model input: video clip, name - `Placeholder`, shape - `1, 79, 224, 224, 3`, format is `B, D, H, W, C`, where:

- `B` - batch size
- `D` - duration of input clip
- `H` - height
- `W` - width
- `C` - channel

Channel order is `RGB`. Mean value - 127.5, scale value - 127.5.
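
The mean and scale values above can be applied as in the sketch below. It assumes the common convention `(pixel - mean) / scale`, which maps 8-bit pixel values to the range [-1, 1]; the function name `preprocess_clip` and the random frames are hypothetical stand-ins for decoded video frames.

```python
import numpy as np

MEAN = 127.5   # mean value from the input description
SCALE = 127.5  # scale value from the input description

def preprocess_clip(frames):
    """Normalize RGB frames of shape (D, H, W, C) and add a batch axis: (1, D, H, W, C)."""
    clip = np.asarray(frames, dtype=np.float32)
    # Assumed convention: subtract the mean, then divide by the scale.
    clip = (clip - MEAN) / SCALE
    return clip[np.newaxis, ...]

# Hypothetical input: 79 random RGB frames of 224x224 pixels standing in for a real clip.
frames = np.random.randint(0, 256, size=(79, 224, 224, 3), dtype=np.uint8)
batch = preprocess_clip(frames)
print(batch.shape, batch.dtype)
```

Decoding the video, resizing frames to 224x224, and sampling 79 frames are outside the scope of this sketch.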

Converted model input: video clip, name - `Placeholder`, shape - `1, 79, 3, 224, 224`, format is `B, D, C, H, W`, where:

- `B` - batch size
- `D` - duration of input clip
- `C` - channel
- `H` - height
- `W` - width

Channel order is `RGB`.

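
A clip stored in the `B, D, H, W, C` layout can be rearranged into the `B, D, C, H, W` layout with a single axis permutation. The sketch below shows only that layout change; the helper name `bdhwc_to_bdchw` is hypothetical, and the sketch makes no assumption about any other preprocessing.

```python
import numpy as np

def bdhwc_to_bdchw(clip):
    """Move the channel axis from last position to just after the duration axis."""
    # Axis order (B, D, H, W, C) -> (B, D, C, H, W).
    return np.transpose(clip, (0, 1, 4, 2, 3))

# Hypothetical clip filled with zeros, in the channels-last layout.
clip = np.zeros((1, 79, 224, 224, 3), dtype=np.float32)
converted = bdhwc_to_bdchw(clip)
print(converted.shape)
```

`np.transpose` returns a view; if the consumer needs contiguous memory, `np.ascontiguousarray(converted)` makes a copy in the new layout.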
Output of both the original and the converted model: action classifier according to Kinetics-400* action classes, name - `Softmax`, shape - `1, 400`, format is `B, C`, where:

- `B` - batch size
- `C` - predicted probabilities for each class in [0, 1] range

The original model is distributed under the Apache License, Version 2.0. A copy of the license is provided in `APACHE-2.0.txt`.