Fig. 4 | Journal of Big Data

Fig. 4

From: Image captioning model using attention and object features to mimic human image understanding

Qualitative examples from MS COCO comparing the generated captions before and after using our object features method, trained on MS COCO (with the importance factor).
a Baseline model: this is up in snow pants jumping on a big snowy mountain at night. With object features: a skier performing a jump against some snow.
b Baseline model: two people on skis sitting on a snowy surface. With object features: a person standing next to snowboards attached.
c Baseline model: a cow is standing in a open field as it grazes. With object features: cows eat alone grazing on grasses in a hill.
d Baseline model: man walking next to an old fashioned planes. With object features: a small black and white picture of a prop plane sitting on the runway.
e Baseline model: a brown bears perch in front of their mom and another animal. With object features: a brown bear is standing behind a group of brown bears.
f Baseline model: two women make homemade my diners can be judged on a table. With object features: a group of people sitting at a blue table of food.
