Semantic context driven language descriptions of videos using deep neural network

Journal of Big Data

Table 4 Evaluation of 2-layer and 3-layer stacked LSTM in the proposed framework on MSVD using METEOR, ROUGE, CIDEr and SPICE

Models	MSVD
	2-Layer Stacked LSTM				3-Layer Stacked LSTM
	METEOR	ROUGE	CIDEr	SPICE	METEOR	ROUGE	CIDEr	SPICE
VGG16 + Stacked LSTM + GloVe (Model_1)	24.7	60.7	32.4	3	24.1	60.9	29.6	3
InceptionV3 + Stacked LSTM + GloVe (Model_2)	33.3	66.6	58.4	4.8	31.1	67.0	64.4	4.9
NASNet + Stacked LSTM + GloVe (Model_3)	32.3	68.8	70.7	5.1	31.8	67.5	71.4	4.9