Much progress has been made in the field of sentiment analysis in recent years. Researchers have traditionally relied on textual data for this task, and only recently have they started investigating approaches to predict sentiment from multimedia content. With the increasing amount of data shared on social media, there is also a rapidly growing interest in approaches that work “in the wild”, i.e., that are able to deal with uncontrolled conditions. In this work, we faced the challenge of training a visual sentiment classifier starting from a large set of user-generated and unlabeled contents. In particular, we collected more than 3 million tweets containing both text and images, and we leveraged the sentiment polarity of the textual contents to train a visual sentiment classifier. To the best of our knowledge, this is the first time that a cross-media learning approach has been proposed and tested in this context. We assessed the validity of our model by conducting comparative studies and evaluations on a benchmark for visual sentiment analysis. Our empirical study shows that although the text associated with each image is often noisy and weakly correlated with the image content, it can be profitably exploited to train a deep Convolutional Neural Network that effectively predicts the sentiment polarity of previously unseen images.
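The cross-media idea described above — using the sentiment polarity of a tweet's text as a weak label for its attached image — can be sketched as follows. The toy lexicon, the neutral-tweet filtering rule, and all function names here are illustrative assumptions, not the classifier or pipeline actually used in the paper; the resulting weak labels would then supervise a CNN on the images.

```python
# Sketch of cross-media weak labeling: text polarity -> image label.
# The lexicon and thresholding are illustrative assumptions only.

POSITIVE = {"love", "great", "happy", "beautiful", "awesome"}
NEGATIVE = {"hate", "sad", "terrible", "awful", "angry"}

def text_polarity(text: str) -> int:
    """Return +1, -1, or 0 using a toy word-count lexicon."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)

def weak_labels(tweets):
    """Map (text, image_path) pairs to (image_path, label) pairs,
    discarding neutral/ambiguous tweets; these weak labels would
    then be used to train a visual sentiment CNN."""
    dataset = []
    for text, image_path in tweets:
        polarity = text_polarity(text)
        if polarity != 0:  # keep only clearly polarized tweets
            label = "positive" if polarity > 0 else "negative"
            dataset.append((image_path, label))
    return dataset

tweets = [("What a beautiful sunset, I love it", "img_001.jpg"),
          ("Terrible traffic again, so angry", "img_002.jpg"),
          ("Just another Tuesday", "img_003.jpg")]
print(weak_labels(tweets))
# → [('img_001.jpg', 'positive'), ('img_002.jpg', 'negative')]
```

Because the text is noisy and only weakly correlated with the image, discarding neutral or ambiguous tweets is one simple way to reduce label noise before training the visual model.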
File: https://www.researchgate.net/publication/319316424_Cross-Media_Learning_for_Image_Sentiment_Analysis_in_the_Wild