N. Messina, G. Amato, F. Carrara, F. Falchi, C. Gennaro: "Learning Relationship-aware Visual Features". In Computer Vision – ECCV 2018 Workshops. (Vol. 4, pp. 486–501). Springer, Cham

Abstract:

Relational reasoning in Computer Vision has recently shown impressive results on visual question answering tasks. On the challenging dataset called CLEVR, the recently proposed Relation Network (RN), a simple plug-and-play module and one of the state-of-the-art approaches, has obtained a very good accuracy (95.5%) answering relational questions. In this paper, we define a sub-field of Content-Based Image Retrieval (CBIR) called Relational-CBIR (R-CBIR), in which we are interested in retrieving images with given relationships among objects. To this aim, we employ the RN architecture in order to extract relation-aware features from CLEVR images. To prove the effectiveness of these features, we extended both CLEVR and Sort-of-CLEVR datasets generating a ground-truth for R-CBIR by exploiting relational data embedded into scene-graphs. Furthermore, we propose a modification of the RN module, a two-stage Relation Network (2S-RN), that enabled us to extract relation-aware features by using a preprocessing stage able to focus on the image content, leaving the question apart. Experiments show that our RN features, especially the 2S-RN ones, outperform the RMAC state-of-the-art features on this new challenging task.

Keywords: CLEVR, Content-Based Image Retrieval, Deep Learning, Relational Reasoning, Relation Networks, Deep Features

File: http://openaccess.thecvf.com/content_eccv_2018_workshops/w23/html/Messina_Learning_Relationship-aware_Visual_Features_ECCVW_2018_paper.html