Large-scale Generative Query Autocompletion
Query Autocompletion (QAC) systems are interactive tools that assist a searcher in entering a query given a partial query prefix. Existing QAC research, with a number of notable exceptions, relies upon large existing query logs from which to extract historical queries. These queries are then ordered by some ranking algorithm as candidate completions, given the query prefix.
Given the numerous search environments (e.g. enterprises, personal or secured data repositories) in which large query logs are unavailable, synthetic (or generative) QAC systems will become increasingly important. Generative QAC systems may be used to augment traditional query-based approaches, and/or entirely replace them in certain privacy-sensitive applications. Even in commercial Web search engines, a significant proportion (up to 15%) of queries issued daily have never been seen previously, meaning there will always be opportunity to assist users in formulating queries which have not occurred historically.
In this paper, we describe a system that can construct generative QAC suggestions within a user-acceptable timeframe (~58 ms), and report on a series of experiments over three publicly available, large-scale question sets that investigate different aspects of the system's performance.