Fig. 5
From: Bilingual video captioning model for enhanced video retrieval
Overall architecture of keyframe extraction