Automated Query Expansion using High Dimensional Clustering

08/28/2018
by Morgan Gallant, et al.

The exponential growth of information on the Internet has made it a major challenge for retrieval systems to yield relevant results. Meeting this challenge requires automatic approaches for reformatting or expanding users' queries to increase recall. Query expansion (QE), a technique for broadening users' queries by appending additional tokens or phrases based on semantic similarity metrics, plays a crucial role here. However, such a procedure increases computational complexity and may introduce unwanted noise into information retrieval. This paper attempts to push the state of the art of QE by developing an automated technique that uses high dimensional clustering of word vectors to create effective expansions with reduced noise. We implemented a command line tool, named Xu, and evaluated its performance against a dataset of news articles, concluding that, on average, expansions generated using this technique outperform both those generated by previous approaches and the base user query.
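The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea of expanding a query by clustering word vectors. Everything in it is an assumption made for illustration: the toy 3-dimensional vectors, the greedy "leader" clustering, the 0.9 similarity threshold, and the names `VECTORS`, `leader_cluster`, and `expand_query` are hypothetical and do not come from the paper or from the Xu tool.

```python
import math

# Hypothetical toy word vectors (real systems use hundreds of dimensions).
VECTORS = {
    "car":        (0.90, 0.10, 0.00),
    "vehicle":    (0.85, 0.15, 0.05),
    "automobile": (0.88, 0.12, 0.02),
    "engine":     (0.70, 0.30, 0.10),
    "banana":     (0.00, 0.90, 0.10),
    "apple":      (0.05, 0.85, 0.20),
    "stock":      (0.10, 0.10, 0.90),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(words):
    """Component-wise mean of the vectors of the given words."""
    vecs = [VECTORS[w] for w in words]
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))

def leader_cluster(words, threshold):
    """Greedy clustering: a word joins the first cluster whose centroid
    is at least `threshold` similar to it, otherwise it starts a new one."""
    clusters = []
    for word in words:
        for cluster in clusters:
            if cosine(VECTORS[word], centroid(cluster)) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters

def expand_query(query_terms, threshold=0.9):
    """Append the terms of the cluster closest to the query centroid."""
    known = [t for t in query_terms if t in VECTORS]
    if not known:
        return list(query_terms)  # nothing to expand from
    query_vec = centroid(known)
    candidates = [w for w in VECTORS if w not in query_terms]
    clusters = leader_cluster(candidates, threshold)
    best = max(clusters, key=lambda c: cosine(query_vec, centroid(c)))
    return list(query_terms) + sorted(best)

print(expand_query(["car"]))
```

In this sketch, unrelated words such as "banana" and "stock" fall into separate clusters and are never appended to a query about cars, which is one simple way clustering can limit the noise that the abstract mentions.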


