Thesis

Applying machine learning to relevance evidence fusion at indexing time

Main author: Silva, Sheila da Nóbrega
Other authors: http://lattes.cnpq.br/4773539555154519, https://orcid.org/0000-0003-3282-1447
Degree: Doctoral thesis
Language: Portuguese
Published: Universidade Federal do Amazonas, 2020
Subjects:
Online access: https://tede.ufam.edu.br/handle/tede/7914
Abstract:
The production of high-quality ranking results is the main goal of web search engines. An important aspect of modern search engines is the use of a large number of distinct sources of relevance evidence to build the learning-to-rank (L2R) model. Collectively, these sources determine whether a document is relevant to a query or not. The ranking of query results is computed by fusing all sources of evidence into a single score for each document in the final ranking. In the past few decades, most of the work on evidence fusion has been done with the implementation of L2R methods. L2R methods use examples of queries and their respective results to train supervised learning models that determine the relative position of the documents in the result list. Once trained, the model can be used during query processing to determine the final ranking. This approach, however, adds computational cost to query processing, which may degrade response time. To mitigate this problem, an alternative approach was proposed in the literature: Learn to Precompute Evidence Fusion (LePrEF), based on supervised learning with Genetic Programming (GP). LePrEF performs the bulk of the evidence fusion at indexing time, generating a single inverted index whose unified entries represent all sources of evidence. These unified entries are called Unified Term Impacts (UTIs). Each unified term impact replaces several features with a single value in the document index, thereby reducing the effort to compute document scores at query processing time, because the system fetches and processes fewer values. The adoption of UTI values produces competitive ranking results; however, the absence of features available only at query time might lead to accuracy loss. In this thesis we study and propose UTI-LambdaMART, a modified LambdaMART gradient boosting algorithm that generates unified term impact (UTI) values at indexing time. We also propose and evaluate a hybrid model that combines UTI values with query-dependent features. We demonstrate that our hybrid methods can deliver high-quality results on par with those of existing state-of-the-art neural ranking models. The experimental results show that our best hybrid model, HLambdaMART, achieves an NDCG@10 of 0.495 on the MQ2007 collection using only 36 features at query processing time, while the best baseline achieves 0.490 using a larger set of features at query processing time. Our hybrid framework reduces the time to run LambdaMART to about 35% of the time required without our proposals. In addition, we study and propose a simple method to obtain significant gains in UTI-index compression with virtually no loss in the quality of search results: our approach achieves a 79% compression rate on the index while keeping the quality of results on par with methods that do not use compression. We also conduct experiments that demonstrate the use of UTI-LambdaMART as a base ranker.
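
The core idea of the thesis, precomputing the fusion of query-independent evidence into a single Unified Term Impact per posting so that query processing only sums stored values, can be sketched roughly as below. This is an illustrative sketch only: the function names (fuse_features, build_uti_index, score_query), the feature names, and the toy linear model are assumptions for illustration, not the thesis' actual UTI-LambdaMART model or index format.

    # Illustrative sketch (not the thesis' implementation): fuse query-independent
    # evidence into one Unified Term Impact (UTI) per posting at indexing time,
    # then rank at query time by summing the precomputed values.
    from collections import defaultdict

    def fuse_features(features, fusion_model):
        # Collapse all query-independent features of a (term, document) pair into
        # one UTI value. `fusion_model` stands in for the learned fusion function
        # (Genetic Programming in LePrEF, gradient boosting in UTI-LambdaMART);
        # here it is simply any callable that maps a feature dict to a float.
        return fusion_model(features)

    def build_uti_index(term_doc_features, fusion_model):
        # Indexing time: store one precomputed UTI per posting instead of the
        # full feature vector, so the index keeps a single value per (term, doc).
        index = defaultdict(dict)  # term -> {doc_id: uti}
        for (term, doc_id), features in term_doc_features.items():
            index[term][doc_id] = fuse_features(features, fusion_model)
        return index

    def score_query(query_terms, index):
        # Query time: a document's score is the sum of the stored UTIs of the
        # query terms, so far fewer values are fetched and processed per document.
        scores = defaultdict(float)
        for term in query_terms:
            for doc_id, uti in index.get(term, {}).items():
                scores[doc_id] += uti
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)

    # Toy usage with a hand-written linear model standing in for a trained one.
    toy_model = lambda f: 0.6 * f["bm25"] + 0.4 * f["pagerank"]
    uti_index = build_uti_index(
        {("ranking", "d1"): {"bm25": 2.1, "pagerank": 0.3},
         ("ranking", "d2"): {"bm25": 1.4, "pagerank": 0.9}},
        toy_model,
    )
    print(score_query(["ranking"], uti_index))  # d1 ranks first (~1.38 vs ~1.20)

A hybrid model in the spirit of HLambdaMART would additionally feed a small set of query-dependent features, together with the precomputed UTIs, to a query-time ranker; that step is omitted from the sketch above.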