Fig. 7
From: Bilingual video captioning model for enhanced video retrieval
Architecture of similarity-based keyframe extraction phase