A novel Multi-Layer Attention Framework for visual description prediction using bidirectional LSTM

Journal of Big Data

Table 1 Performance of the proposed models on the MSVD and MSR-VTT datasets (B@n denotes BLEU-n)

Model	MSVD					MSR-VTT
	B@1	B@2	B@3	B@4	METEOR	B@1	B@2	B@3	B@4	METEOR
Base model	66.01	49.42	38.69	27.19	48.14	57.75	37.49	29.50	16.05	36.25
Base model with BN	62.07	40.28	27.07	16.61	39.30	63.09	38.84	26.99	14.02	35.82
Stacked LSTM	67.49	51.98	41.90	31.23	49.19	58.18	41.41	32.02	17.61	37.88
Multi-layer attention (proposed)	70.50	56.62	49.60	33.07	51.77	60.33	43.72	34.12	19.61	39.47