Notable Site Recognition on Mobile Devices using Deep Learning with Crowd-sourced Imagery
In this work we design a mobile system that automatically recognises sites of interest and projects relevant information to a user navigating the city. First, we use Wikipedia to build a collection of notable sites that bootstraps our system. We then exploit online services such as Google Images and Flickr to gather large sets of crowdsourced imagery describing those sites. These images are used to train minimal deep learning architectures that can be transmitted and deployed to mobile devices efficiently, becoming accessible to users through a dedicated application. Through an evaluation comprising a series of online and real-world experiments, we identify a number of key challenges that make the successful deployment of a site recognition system difficult, and we highlight the importance of incorporating mobile contextual information to facilitate the visual recognition task. Similarity in the feature maps of the objects being identified, the presence of noise in crowdsourced imagery and arbitrary user-induced inputs are among the factors that impede correct classification by deep learning models. We show how curating the training data with a class-specific image denoising method, and incorporating information such as user location, orientation and attention patterns, can significantly improve classification accuracy and yield a system that recognises sites in the wild two out of three times.
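One way to picture how user location could assist the visual classifier is to re-weight the classifier's per-site scores by the user's distance to each site. The sketch below is illustrative only and is not the method described in this work: the function names, the exponential distance weighting and the `scale_m` parameter are all assumptions introduced for the example.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def rerank_with_location(scores, site_coords, user_lat, user_lon, scale_m=200.0):
    """Re-weight classifier scores by proximity to the user (hypothetical scheme).

    scores      -- dict mapping site name to a classifier score in [0, 1]
    site_coords -- dict mapping site name to a (lat, lon) tuple
    scale_m     -- assumed decay length: sites this far away are down-weighted by 1/e
    """
    weighted = {}
    for site, score in scores.items():
        lat, lon = site_coords[site]
        distance = haversine_m(user_lat, user_lon, lat, lon)
        weighted[site] = score * math.exp(-distance / scale_m)
    total = sum(weighted.values())
    if total == 0:
        # Every site is effectively out of range: fall back to the visual scores.
        return dict(scores)
    return {site: w / total for site, w in weighted.items()}


# Made-up example: the classifier slightly prefers site "A",
# but the user is standing at site "B".
visual_scores = {"A": 0.6, "B": 0.4}
coords = {"A": (52.2053, 0.1218), "B": (51.5007, -0.1246)}
fused = rerank_with_location(visual_scores, coords, 51.5007, -0.1246)
best_site = max(fused, key=fused.get)
```

With these made-up inputs the distant site is suppressed and `best_site` becomes `"B"`. Orientation or attention signals could be folded in as additional multiplicative weights in the same way, again as an assumption rather than a description of the deployed system.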