Object Recognition with Human in the Loop Intelligent Frameworks
Classifiers embedded within human in the loop visual object recognition frameworks commonly utilise two sources of information: one derived directly from the imagery data of an object, and the other obtained interactively from user interactions. These computer vision frameworks exploit human high-level cognitive power to tackle particularly difficult visual object recognition tasks. In this paper, we present innovative techniques to combine the two sources of information intelligently for the purpose of improving recognition accuracy. We firstly employ standard algorithms to build two classifiers for the two sources independently, and subsequently fuse the outputs from these classifiers to make a conclusive decision. The two fusion techniques proposed are: i) a modified naive Bayes algorithm that adaptively selects an individual classifier's output or combines both to produce a definite answer, and ii) a neural network based algorithm which feeds the outputs of the two classifiers to a 4-layer feedforward network to generate a final output. We present extensive experimental results on 4 challenging visual recognition tasks to illustrate that the new intelligent techniques consistently outperform traditional approaches to fusing the two sources of information.
READ FULL TEXT