
Table 7. Comparison of the proposed framework with state-of-the-art methods with respect to METEOR and CIDEr scores

From: Semantic context driven language descriptions of videos using deep neural network

| Method                         | METEOR | CIDEr |
|--------------------------------|--------|-------|
| S-VC [48]                      | 29.3   | –     |
| SA [49]                        | 29.6   | 51.7  |
| S2VT [57]                      | 29.2   | –     |
| S2VT[VGGNet+Optical flow] [57] | 29.8   | –     |
| MM-VDN [50]                    | 29.0   | –     |
| MP-LSTM [9]                    | 29.1   | –     |
| LSTM-E[VGGNet] [51]            | 29.5   | –     |
| LSTM-E[C3D] [51]               | 29.9   | –     |
| LSTM-E[VGGNet+C3D] [51]        | 31.0   | –     |
| LSTM-GAN [8]                   | 30.4   | –     |
| p-RNN[C3D] [11]                | 30.3   | –     |
| p-RNN[VGGNet] [11]             | 31.1   | –     |
| LVMVP [53]                     | 29.9   | 51.1  |
| BPLSTM [55]                    | 32.0   | 62.20 |
| HRNE [12]                      | 32.1   | –     |
| HBNEVC [52]                    | –      | 63.5  |
| SE-GRU [54]                    | –      | 62.3  |
| STAT [58]                      | –      | 67.5  |
| MA-LSTM [29]                   | –      | 70.4  |
| UTS [56]                       | 33.20  | 71.10 |
| STAT_LOC_V [10]                | 30.5   | 62.8  |
| STAT_LOC_L [10]                | 31.0   | 62.5  |
| Model_3 (Proposed)             | 32.3   | 70.7  |