2.1. Scenario 1: Image from Caption