From: Semantic context driven language descriptions of videos using deep neural network
Method | BLEU@1 | BLEU@2 | BLEU@3 | BLEU@4 |
---|---|---|---|---|
S-VC [48] | – | – | – | 35.1 |
SA [49] | – | – | – | 40.3 |
MM-VDN [50] | – | – | – | 37.6 |
LSTM-E [51] | 74.9 | 60.9 | 50.6 | 40.2 |
HBNEVC [52] | – | – | – | 42.5 |
LVMVP [53] | – | – | – | 40.1 |
LSTM-GAN [8] | – | – | – | 42.9 |
SE-GRU [54] | – | – | – | 42.9 |
BPLSTM [55] | 78.4 | 64.8 | 53.8 | 42.9 |
UTS [56] | – | – | – | 43.0 |
STAT_LOC_V [10] | – | – | – | 43.2 |
STAT_LOC_L [10] | – | – | – | 42.9 |
p-RNN(VGGNet) [11] | 77.3 | 64.5 | 54.6 | 44.3 |
Model_3 (Proposed) | 78.4 | 64.8 | 54.2 | 43.7 |