ViSketch-GPT: Collaborative Multi-Scale Feature Extraction for Hand-Drawn Sketch Retrieval.

Written by Giulio Federico, Fabio Carrara, Claudio Gennaro, Marco Di Benedetto

Understanding hand-drawn sketches is challenging because the way people draw them varies widely. Federico et al. demonstrated that recognizing complex structural patterns improves both sketch recognition and sketch generation. Building on this foundation, we explore how the extracted features can also be leveraged for hand-drawn sketch retrieval. In this work, we extend ViSketch-GPT, a multi-scale context extraction model, to the retrieval task. The model's ability to capture intricate details at multiple scales allows it to learn highly discriminative representations, making it well suited for retrieval applications. Through extensive experiments on the QuickDraw and TU-Berlin datasets, we show that ViSketch-GPT surpasses state-of-the-art methods in sketch retrieval, achieving substantial improvements across multiple evaluation metrics. Our results show that the extracted feature representations, originally designed for classification and generation, are also highly effective for retrieval tasks.
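
As a rough illustration of what feature-based retrieval means here, the sketch below ranks a gallery of feature vectors by cosine similarity to a query vector. It is not the ViSketch-GPT pipeline: the abstract does not specify the similarity measure, the function names (`l2_normalize`, `cosine_similarity`, `retrieve`) are hypothetical, and the small hand-written vectors merely stand in for features that a trained model would extract from sketches.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def retrieve(query, gallery, k=3):
    """Return indices of the k gallery vectors most similar to the query.

    Ties are broken by the lower gallery index so the ranking is deterministic.
    """
    scored = [(cosine_similarity(query, feat), idx)
              for idx, feat in enumerate(gallery)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy stand-ins for extracted sketch features (assumption: real features
# would come from the model and have far more dimensions).
gallery = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.02, 0.0]

print(retrieve(query, gallery, k=2))
```

In practice, retrieval quality is then scored by checking whether the top-ranked gallery items share the query's class, which is why discriminative features matter for this task.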