Table 4 Evaluation of 2-layered and 3-layered LSTM in proposed framework using METEOR, ROUGE, CIDEr and SPICE

From: Semantic context driven language descriptions of videos using deep neural network

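Dataset: MSVD. "2-layer" = 2 Layer Stacked LSTM; "3-layer" = 3 Layer Stacked LSTM.

| Models | 2-layer METEOR | 2-layer ROUGE | 2-layer CIDEr | 2-layer SPICE | 3-layer METEOR | 3-layer ROUGE | 3-layer CIDEr | 3-layer SPICE |
|---|---|---|---|---|---|---|---|---|
| VGG16 + Stacked LSTM + GloVe (Model_1) | 24.7 | 60.7 | 32.4 | 3 | 24.1 | 60.9 | 29.6 | 3 |
| InceptionV3 + Stacked LSTM + GloVe (Model_2) | 33.3 | 66.6 | 58.4 | 4.8 | 31.1 | 67.0 | 64.4 | 4.9 |
| NASNet + Stacked LSTM + GloVe (Model_3) | 32.3 | 68.8 | 70.7 | 5.1 | 31.8 | 67.5 | 71.4 | 4.9 |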