Score, rank fusion and reranking
What looks like a simple query may involve more steps than one realises at first. Nuclia supports multiple search modes, rank fusion and reranking. Everything is exposed in the API so you can build a custom search experience that satisfies your needs.
This document is a deep dive on how scoring, ranking and reranking work in Nuclia.
Concepts
First, let's dive into the general concepts.
Scoring documents
Nuclia supports multiple search modes, but we'll focus on two: keyword and semantic search. Each of them provides a different search experience. On one hand, keyword search is good for queries matching important words, but its multilingual experience isn't particularly good. On the other hand, semantic search with an appropriate model can find content similar in meaning across languages and without matching words, but it may not be the best for keyword-like queries either. The truth is there's no best option for everyone; it always depends on the use case. Usually, the combination of both is the key to finding the information you ask for.
Each search mode has a scoring mechanism and results in a ranked list of text blocks (paragraphs) sorted from best to worst. These scores make sense within their own search mode but are hard to compare with those of other search modes. Keyword search uses BM25, while semantic search uses a distance function between embeddings. Comparing the scores directly is like comparing apples with oranges: both are fruits (scores), but choosing the best one is not trivial.
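To make the scale mismatch concrete, here is a minimal sketch with toy formulas and made-up numbers (not Nuclia's internals): cosine similarity between embeddings is bounded in [-1, 1], while a BM25 score is unbounded above, so sorting the two kinds of numbers together is meaningless.

```python
import math

def cosine_similarity(a, b):
    # Semantic-style score: cosine of the angle between embeddings, in [-1, 1]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    # Keyword-style score: one BM25 term contribution, unbounded above
    return idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))

semantic = cosine_similarity([0.1, 0.8, 0.3], [0.2, 0.7, 0.1])
keyword = bm25_term_score(tf=3, doc_len=120, avg_doc_len=100, idf=2.5)
# `semantic` lands in [-1, 1]; `keyword` can grow past it freely.
```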
Rank fusion
A rank fusion algorithm merges multiple ranked lists into a single one. These algorithms are usually simple and fast, and fall into one of the following categories: score-based or rank-based.
Score-based algorithms use each element's score and produce a new unified score. Using them usually assumes the scores are comparable. Imagine, for example, two lists from keyword searches, both scored with BM25: one could simply merge them and sort again.
Rank-based algorithms assume the best apple is comparable to the best orange, just because both are the best in their lists. This is really useful when the scores are unrelated, like BM25 and dot distance. A famous algorithm in this category is Reciprocal Rank Fusion (RRF), which produces a new score per element depending on its rank and, if an element appears in multiple lists, sums its scores. That way, we can merge unrelated scores and boost matches found by both search modes.
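As a sketch, RRF fits in a few lines (the `k = 60` constant is the value commonly used in the literature; the `doc_*` ids are made up). An element's fused score is the sum of `1 / (k + rank)` over every list it appears in, so a document ranked well in both lists rises to the top:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists of ids; fused score = sum of 1 / (k + rank)."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort ids by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
# doc_b (ranks 2 and 1) edges out doc_a (ranks 1 and 3):
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```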
Reranking
Once we have a unified list of scored elements, there's an optional reranking step, where elements are revisited and reordered with some criteria. At Nuclia, we provide the option to use a cross-encoder model to rerank the results.
In simpler words, this compares each result with the query and assigns a new score depending on how well the result answers the query. This is an expensive process (compared with rank fusion), but result quality improves a lot. As we have been "blindly" merging elements from keyword and semantic search, reranking boosts the actually good results.
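Conceptually, reranking is just "rescore every candidate against the query, then sort". Here's a sketch with a toy word-overlap scorer standing in for a real cross-encoder model (which would be a learned model seeing query and candidate together, not a hand-written function):

```python
def rerank(query, candidates, score_fn):
    # score_fn plays the role of a cross-encoder: it sees the query and one
    # candidate together and returns a relevance score.
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

def toy_score(query, text):
    # Toy relevance: fraction of query words appearing in the candidate
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q)

results = rerank(
    "how do I rotate an API key",
    ["Rotating an API key", "Billing overview", "API reference index"],
    toy_score,
)
# The candidate that best answers the query is moved to the front.
```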
Windows
Sometimes, asking for 20 results and doing all these steps with 20 elements is not good enough. That's why at Nuclia we allow defining a window for rank fusion and reranking.
A window here means using more results than top_k to improve search quality. In RRF, for example, a match appearing in multiple lists increases its score. Longer lists therefore increase the chances of a multi-list match and thus boost the scores of some results.
Something similar happens in reranking: choosing the top 20 out of 50 candidates gives better results than simply reordering an already chosen top 20.
Increasing those windows has a cost, as we process more results for the same query, but it provides a better search experience and higher-quality results.
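The whole pipeline with windows can be sketched as follows (the fuse/rerank functions are toy stand-ins, and the window sizes are illustrative): fuse a wide slice of each ranked list, rerank a wide slice of the fused list, and only then cut down to top_k.

```python
def windowed_pipeline(keyword_hits, semantic_hits, fuse, rerank,
                      top_k=20, fusion_window=80, rerank_window=50):
    # Fuse a wider slice of each ranked list than we intend to return...
    fused = fuse([keyword_hits[:fusion_window], semantic_hits[:fusion_window]])
    # ...rerank a wide slice of the fused list...
    reranked = rerank(fused[:rerank_window])
    # ...and only then cut down to the requested top_k.
    return reranked[:top_k]

def toy_fuse(lists):
    # Toy stand-in: order-preserving, deduplicating concatenation
    return list(dict.fromkeys(x for lst in lists for x in lst))

final = windowed_pipeline(
    [f"k{i}" for i in range(100)],   # 100 keyword hits
    [f"s{i}" for i in range(100)],   # 100 semantic hits
    fuse=toy_fuse,
    rerank=lambda hits: hits,        # identity "reranker" for the sketch
)
# `final` holds exactly top_k (20) results drawn from the wider windows.
```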
Search tuning at Nuclia
Let's see how to use all these options to customize the search experience and get better results!
Starting with scoring/ranking, the features parameter on the search endpoints has the keyword and semantic options to trigger these search modes. Both are used by default, so no action is usually needed.
Tuning rank fusion can be trickier, as it depends on the dataset and the queries. The rank_fusion parameter in /find and /ask allows tuning rank fusion. RRF is the recommended algorithm and provides multiple options to customize the search experience, such as changing the k parameter or adding boosting/weights to certain search modes.
Finally, Nuclia's cross-encoder reranker is available using the reranker=predict option in the /find and /ask endpoints.
Use case example: search across languages
Let's see how everything fits in an example.
Imagine you have a multilingual knowledge base and queries are usually done in a different language than your content. Keyword search won't provide the best results, although it can still be useful for names, abbreviations... We'll keep both keyword and semantic search, knowing that semantic will usually be better.
In the rank fusion step, as we are quite sure semantic results will usually be better, we'll use RRF with semantic boosting. We'll double the scores of semantic results compared with keyword ones, so the end result contains more matches coming from semantic search.
Finally, we'll use a cross-encoder to rerank the final results and boost the best ones.
We want the 20 best results but use rank fusion on 80 and rerank the top 50, so we define the windows accordingly.
In an actual request, these are the parameters to use in /find or /ask:
{
  "features": ["semantic", "keyword"],
  "rank_fusion": {
    "name": "rrf",
    "boosting": {
      "semantic": 2
    },
    "window": 80
  },
  "reranker": {
    "name": "predict",
    "window": 50
  },
  "top_k": 20,
  ...
}
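For reference, here's a sketch of building that request body in Python. The query text is made up, and actually sending the request requires your Knowledge Box URL and authentication headers, which depend on your zone and account and are omitted here:

```python
import json

# Request body mirroring the example above; the boosting weight and window
# sizes are the values chosen for this multilingual use case.
payload = {
    "query": "how do I rotate an API key",  # hypothetical query text
    "features": ["semantic", "keyword"],
    "rank_fusion": {"name": "rrf", "boosting": {"semantic": 2}, "window": 80},
    "reranker": {"name": "predict", "window": 50},
    "top_k": 20,
}

body = json.dumps(payload)
# POST `body` to your Knowledge Box /find (or /ask) endpoint with your usual
# authentication headers.
```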