Fig. 1
From: Cross-modality representation learning from transformer for hashtag prediction
An example of a multimedia post from Instagram. The text alone offers limited information; without the visual information, the correct hashtags cannot be recommended.