Combining lexical and spatial knowledge to predict spatial relations between objects in images

Hürlimann, Manuela
Bos, Johan
Manuela Hürlimann, Johan Bos (2016). Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images. The 5th Workshop on Vision and Language (VL'16).
Explicit representations of images are useful for linguistic applications related to images. We design a representation based on first-order models that capture the objects present in an image as well as their spatial relations. We take a supervised learning approach to the spatial relation classification problem and study the effects of spatial and lexical information on prediction performance. We find that lexical information, combined with location information, is required to predict spatial relations accurately, achieving an F-score of 0.80 against a most-frequent-class baseline of 0.62.
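
As an illustration only, the following Python sketch shows one way a first-order model of an image could be stored (a domain of entities, unary predicates for object labels, binary predicates for spatial relations) and how a most-frequent-class baseline picks its prediction. The class name `ImageModel`, the entity identifiers, the object labels, the relation names, and the training labels are all hypothetical choices made for this example; they are not taken from the paper's data or implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ImageModel:
    """Toy first-order model: a domain plus interpreted predicates (hypothetical design)."""
    domain: set = field(default_factory=set)
    unary: dict = field(default_factory=dict)   # object label -> set of entities
    binary: dict = field(default_factory=dict)  # relation name -> set of (entity, entity) pairs

    def add_object(self, entity, label):
        """Add an entity to the domain and assert its object label."""
        self.domain.add(entity)
        self.unary.setdefault(label, set()).add(entity)

    def add_relation(self, name, first, second):
        """Assert an ordered spatial relation between two known entities."""
        if first not in self.domain or second not in self.domain:
            raise ValueError("both arguments must be in the domain")
        self.binary.setdefault(name, set()).add((first, second))

    def holds(self, name, first, second):
        """Check whether the relation holds for this ordered pair."""
        return (first, second) in self.binary.get(name, set())


def most_frequent_class(labels):
    """Baseline: always predict the label seen most often in training."""
    return Counter(labels).most_common(1)[0][0]


# Invented example image: a cat on a sofa.
model = ImageModel()
model.add_object("d1", "cat")
model.add_object("d2", "sofa")
model.add_relation("supports", "d2", "d1")

# Invented training labels for the baseline.
training_labels = ["touches", "supports", "touches", "part_of", "touches"]
baseline_prediction = most_frequent_class(training_labels)
```

Storing relations as sets of ordered pairs keeps asymmetric relations distinct: here `supports` holds for the pair (sofa, cat) but not for (cat, sofa). The baseline ignores both spatial and lexical features, which is why it serves as a lower reference point for a trained classifier.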
