Fig. 4
From: Image caption generation using Visual Attention Prediction and Contextual Spatial Relation Extraction
A few samples of failure captions generated by the proposed method. The descriptions in blue and red represent the ground truth and the generated captions, respectively.