Semantic context driven language descriptions of videos using deep neural network

Journal of Big Data

Table 2 Evaluation of 2-layer and 3-layer stacked LSTMs in the proposed framework using BLEU metrics (B@n denotes the BLEU-n score)

Models	MSVD
	2-Layer Stacked LSTM				3-Layer Stacked LSTM
	B@1	B@2	B@3	B@4	B@1	B@2	B@3	B@4
VGG16 + Stacked LSTM + GloVe (Model_1)	69.1	50.1	38.2	27.0	68.1	48.8	37.0	25.58
InceptionV3 + Stacked LSTM + GloVe (Model_2)	74.3	60.1	49.7	40.2	73.6	59.8	49.5	38.5
NASNet + Stacked LSTM + GloVe (Model_3)	78.4	64.8	54.2	43.7	78.2	65.3	55.1	44.0