Jurek Leonhardt, Koustav Rudra, Megha Khosla, Abhijit Anand, Avishek Anand
L3S Research Center, Hannover, Germany
Corpus
Document ID: D001
Life is a characteristic that distinguishes physical entities that have biological processes, such as signaling and self-sustaining processes, from those that do not, either because such functions have ceased (they have died) or because they never had such functions and are classified as inanimate. Various forms of life exist, such as plants, animals, fungi, protists, archaea, and bacteria. Biology is the science that studies life. [...]
Source: https://en.wikipedia.org/wiki/Life
Document
| Term | Freq. |
|---|---|
| life | D001: 3 |
| characteristic | D001: 1 |
| distinguishes | D001: 1 |
| ... | ... |
Term frequencies
Inverted index
🤔 What is the meaning of life?
↓
Inverted index ➔ Term matching (BM25) ➔ Candidate documents
↓
Re-ranking model (re-ranking step)
↓
Final ranking:

| Rank | Document | Score |
|---|---|---|
| #1 | D371 | 0.95 |
| #2 | D222 | 0.85 |
| #3 | D984 | 0.71 |
| #4 | D017 | 0.60 |
| #5 | D101 | 0.45 |
Documents and query are embedded in a common vector space.
Retrieval is performed as an (approximate) nearest neighbor search operation.
The vector space is typically 768-dimensional.
The dense (semantic) relevance score of a document $d$ w.r.t. a query $q$ is computed as
$\phi_D (q, d) = \zeta (q) \cdot \eta (d)$,
where $\zeta$ is the query encoder and $\eta$ is the document encoder.
The dot product of two vectors encodes their proximity in the vector space.
A dense index contains $\eta (d)$ pre-computed for all documents in the corpus.
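As an illustrative sketch (document vectors, query vector, and dimensionality are made up; in reality $\zeta$ and $\eta$ are neural encoders), scoring a query against a pre-computed dense index reduces to a matrix-vector product:

```python
import numpy as np

# Hypothetical dense index: one pre-computed vector eta(d) per document
# (768-dimensional in practice; 4 dimensions here to keep the sketch small).
doc_ids = ["D001", "D002", "D003"]
index = np.array([
    [0.1, 0.9, 0.0, 0.2],  # eta(D001)
    [0.8, 0.1, 0.3, 0.1],  # eta(D002)
    [0.2, 0.7, 0.5, 0.4],  # eta(D003)
])

def dense_scores(query_vec, index):
    # phi_D(q, d) = zeta(q) . eta(d), computed for all documents at once
    return index @ query_vec

query_vec = np.array([0.3, 0.6, 0.1, 0.0])  # zeta(q), hypothetical
scores = dense_scores(query_vec, index)
ranking = [doc_ids[i] for i in np.argsort(-scores)]  # best first
```

At corpus scale, the exhaustive product is replaced by an (approximate) nearest neighbor search over the same vectors.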
For a query $q$, retrieve $K^q_S$ using a sparse retriever and $K^q_D$ using a dense retriever.
Rank the documents by combining the scores.
$\phi (q, d) = \alpha \cdot \phi_S (q, d) + (1 - \alpha) \cdot \phi_D (q, d)$
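A minimal sketch of this score combination, with hypothetical document IDs and scores (in practice the two score distributions live on different scales, so normalization or a carefully tuned $\alpha$ may be needed):

```python
# Hypothetical retrieved candidate sets with their scores
phi_s = {"D001": 14.2, "D002": 9.1, "D003": 7.5}   # sparse (BM25) scores
phi_d = {"D001": 0.71, "D004": 0.88, "D003": 0.65}  # dense scores

def interpolate(phi_s, phi_d, alpha=0.5, default=0.0):
    # phi(q, d) = alpha * phi_S(q, d) + (1 - alpha) * phi_D(q, d),
    # over the union of both candidate sets; documents missing from one
    # set fall back to a default score (a simplification for this sketch)
    docs = set(phi_s) | set(phi_d)
    return {
        d: alpha * phi_s.get(d, default) + (1 - alpha) * phi_d.get(d, default)
        for d in docs
    }

hybrid_ranking = sorted(interpolate(phi_s, phi_d).items(), key=lambda x: -x[1])
```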
| | AP@1k | R@1k | nDCG@10 |
|---|---|---|---|
| **Sparse retrieval** | | | |
| BM25 | 0.331 | 0.697 | 0.519 |
| **Dense retrieval** | | | |
| TCT-ColBERT | 0.279 | 0.576 | 0.612 |
| ANCE | 0.254 | 0.510 | 0.633 |
| **Re-ranking (BM25 retrieval)** | | | |
| TCT-ColBERT | 0.370 | 0.697 | 0.685 |
| ANCE | 0.336 | 0.697 | 0.654 |
| BERT | 0.283 | 0.697 | 0.520 |
| **Interpolation (with BM25)** | | | |
| TCT-ColBERT | 0.406 | 0.697 | 0.696 |
| ANCE | 0.387 | 0.697 | 0.673 |
| BERT | 0.365 | 0.697 | 0.612 |
Table: TREC-DL 2019 document ranking results.
$\phi (q, d) = \alpha \cdot \phi_S (q, d) + (1 - \alpha) \cdot \phi_D (q, d)$
Our approach: Fast re-ranking with dual-encoders.
A Fast-Forward index is a look-up table for pre-computed document representations:
$\phi_D^{FF} (q, d) = \zeta (q) \cdot \eta^{FF} (d)$
The look-up operation is much more efficient than actual re-ranking:
| nDCG@20 | 0.632 | 0.655 | 0.655 |
|---|---|---|---|
| Query latency [ms] | 1191 | 1203 | 253 |
Figure: TCT-ColBERT on the TREC-DL 2019 document ranking task.
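The look-up table idea can be sketched as follows. This is a minimal illustration with a plain dictionary, not the library's actual API:

```python
import numpy as np

class FastForwardIndex:
    # Minimal sketch: a look-up table from document ID to the
    # pre-computed representation eta^FF(d).

    def __init__(self):
        self._table = {}

    def add(self, doc_id, vector):
        self._table[doc_id] = np.asarray(vector, dtype=np.float32)

    def score(self, query_vec, doc_ids):
        # phi_D^FF(q, d) = zeta(q) . eta^FF(d): one look-up and one
        # dot product per candidate, no model forward pass at query time
        return {d: float(self._table[d] @ query_vec) for d in doc_ids}

idx = FastForwardIndex()
idx.add("D001", [0.1, 0.9])  # hypothetical 2-dimensional vectors
idx.add("D002", [0.8, 0.2])
ff_scores = idx.score(np.array([1.0, 0.0], dtype=np.float32), ["D001", "D002"])
```

Only the query still needs to be encoded at query time; the per-document cost drops to a dictionary access and a dot product, which is where the latency reduction in the figure comes from.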
In the following, we introduce techniques to improve efficiency even further.
Long documents are often chunked into passages and scored using a $\max$ operation:
$\phi_D^{FF} (q, d) = \max_{p_i \in d} (\zeta (q) \cdot \eta^{FF} (p_i))$
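A sketch of this maxP scoring, assuming the passage vectors $\eta^{FF}(p_i)$ of one document are stacked into a matrix (illustrative values, not the paper's code):

```python
import numpy as np

def maxp_score(query_vec, passage_vecs):
    # phi_D^FF(q, d) = max over passages p_i of zeta(q) . eta^FF(p_i)
    return float(np.max(passage_vecs @ query_vec))

# Hypothetical passage representations of a single document
passage_vecs = np.array([
    [0.1, 0.9],
    [0.8, 0.2],
    [0.5, 0.5],
])
q_vec = np.array([1.0, 0.0])  # hypothetical query representation
score = maxp_score(q_vec, passage_vecs)  # max(0.1, 0.8, 0.5)
```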
We propose sequential coalescing to compress maxP indexes.
Figure: all passage representations of a document in the embedding space. Consecutive passages $p_0, \dots, p_6$ that lie close together are merged by an aggregation $\mathcal{A}$ into coalesced representations $\hat{p}_1, \dots, \hat{p}_4$.
```python
import numpy as np

def distance(a, b):
    # cosine distance between two vectors
    return 1 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def coalesce(P, delta):
    # Merge runs of consecutive, similar passage representations
    # into their average (sequential coalescing).
    P_new = []
    A = []        # current run of passage vectors
    A_avg = None  # running average of the current run
    first_iteration = True
    for p in P:
        if first_iteration:
            first_iteration = False
        elif distance(p, A_avg) >= delta:
            # p is too far from the current average:
            # emit the average and start a new run
            P_new.append(A_avg)
            A = []
        A.append(p)
        A_avg = np.mean(A, axis=0)
    P_new.append(A_avg)
    return P_new
```
| Distance threshold $\delta$ | Index size reduction | nDCG |
|---|---|---|
| 0 | | 0.676 |
| 0.01 | | 0.673 |
| 0.015 | 27% | 0.672 |
| 0.02 | 46% | 0.667 |
| 0.025 | 60% | 0.661 |
| 0.03 | 70% | 0.659 |
| 0.035 | 77% | 0.657 |
| 0.04 | 80% | 0.653 |
| 0.045 | 82% | 0.657 |
| 0.05 | 83% | 0.658 |
| 0.1 | 84% | 0.659 |
Figure: TCT-ColBERT on TREC-DL 2019 document ranking with sequential coalescing.
Many tasks require only a small number of documents (e.g. top-10).
Lexical (sparse) and semantic (dense) scores are usually correlated.
In many cases, it is not necessary to compute all semantic scores.
We propose an approach for early stopping.
We set $\alpha = 0.5$:

$\phi (q, d) = 0.5 \cdot \phi_S (q, d) + 0.5 \cdot \phi_D (q, d)$

The candidates are processed in decreasing order of their interpolated scores. We stop the computation once the highest possible score $\phi_{\text{max}} (q, d)$ of any remaining document is too low to change the top results.

$\phi_D$ is not normalized, so we approximate its maximum by the highest dense score observed so far. For example:

$\phi_{\text{max}} (q, \text{D224}) = 0.5 \cdot 0.73 + 0.5 \cdot 0.67 = 0.70 \gt 0.68$ (D224 may still enter the top results, so we continue)

$\phi_{\text{max}} (q, \text{D105}) = 0.5 \cdot 0.49 + 0.5 \cdot 0.71 = 0.60 \lt 0.72$ (D105 cannot make a difference, so we stop)
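The early stopping procedure can be sketched as follows, with hypothetical document IDs and scores; `dense_lookup.get` stands in for a Fast-Forward look-up:

```python
import heapq

def interpolate_early_stopping(sparse_ranking, dense_score, alpha=0.5, k=10):
    # sparse_ranking: (doc_id, phi_S) pairs sorted by phi_S, descending.
    # dense_score: callable doc_id -> phi_D, e.g. a Fast-Forward look-up.
    top = []                    # min-heap of (phi, doc_id): current top-k
    best_dense = float("-inf")  # highest phi_D observed so far
    for doc_id, s in sparse_ranking:
        d = dense_score(doc_id)
        best_dense = max(best_dense, d)
        phi = alpha * s + (1 - alpha) * d
        if len(top) < k:
            heapq.heappush(top, (phi, doc_id))
        else:
            heapq.heappushpop(top, (phi, doc_id))
            # Remaining documents have phi_S <= s (sorted input) and,
            # by the approximation above, phi_D <= best_dense. If even
            # this upper bound cannot beat the current k-th best score,
            # further look-ups cannot change the top-k: stop early.
            if alpha * s + (1 - alpha) * best_dense <= top[0][0]:
                break
    return sorted(top, reverse=True)

# Hypothetical candidates, sorted by sparse score
candidates = [("D1", 1.0), ("D2", 0.9), ("D3", 0.1), ("D4", 0.05)]
dense_lookup = {"D1": 0.8, "D2": 0.9, "D3": 0.2, "D4": 0.99}
top2 = interpolate_early_stopping(candidates, dense_lookup.get, alpha=0.5, k=2)
```

Note that D4 is pruned here even though its dense score is high: approximating the dense maximum by the highest observed score trades a small risk of such misses for the latency savings reported below.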
| Cut-off depth $k$ | |
|---|---|
| 100 | 4114 |
| 90 | 4023 |
| 80 | 3889 |
| 70 | 3736 |
| 60 | 3544 |
| 50 | 3302 |
| 40 | 3050 |
| 30 | 2665 |
| 20 | 2179 |
| 10 | 1319 |
Figure: ANCE on TREC-DL 2019 passage ranking with early stopping.
| | Latency [ms] | AP@1k ($k_S{=}1000$) | R@1k | nDCG@20 | AP@1k ($k_S{=}5000$) | R@1k | nDCG@20 |
|---|---|---|---|---|---|---|---|
| **Hybrid retrieval** | | | | | | | |
| BM25, TCT-ColBERT | 582 | 0.463 | 0.809 | 0.615 | 0.469 | 0.852 | 0.621 |
| BM25, ANCE | 582 | 0.479 | 0.809 | 0.621 | 0.488 | 0.846 | 0.632 |
| **Re-ranking (BM25 retrieval)** | | | | | | | |
| TCT-ColBERT | 1189 + 2 | 0.414 | 0.809 | 0.587 | 0.405 | 0.794 | 0.585 |
| ANCE | 1189 + 2 | 0.426 | 0.809 | 0.593 | 0.422 | 0.761 | 0.604 |
| BERT | 185 + 2 | 0.329 | 0.809 | 0.512 | 0.221 | 0.727 | 0.375 |
| **Interpolation (with BM25)** | | | | | | | |
| TCT-ColBERT | 1189 + 14 | 0.469 | 0.809 | 0.621 | 0.478 | 0.838 | 0.626 |
| w/ Fast-Forward | 253 | 0.469 | 0.809 | 0.621 | 0.478 | 0.838 | 0.626 |
| w/ Fast-Forward, coalesced | 109 | 0.440 | 0.809 | 0.594 | 0.447 | 0.837 | 0.607 |
| ANCE | 1189 + 14 | 0.490 | 0.809 | 0.630 | 0.502 | 0.828 | 0.640 |
| w/ Fast-Forward | 253 | 0.490 | 0.809 | 0.630 | 0.502 | 0.828 | 0.640 |
| w/ Fast-Forward, coalesced | 121 | 0.471 | 0.809 | 0.622 | 0.479 | 0.823 | 0.629 |
| BERT | 185 + 14 | 0.460 | 0.809 | 0.602 | 0.459 | 0.839 | 0.601 |
Table: TREC-DL 2020 document ranking results; the left metric group uses $k_S = 1000$, the right one $k_S = 5000$. Latency is reported for $k_S = 5000$ on GPU and CPU.
| | Latency [ms] | AP@1k ($k_S{=}1000$) | RR@10 | AP@1k ($k_S{=}5000$) | RR@10 |
|---|---|---|---|---|---|
| **Hybrid retrieval** | | | | | |
| BM25, TCT-ColBERT | 307 | 0.434 | 0.894 | 0.454 | 0.902 |
| BM25, ANCE | 307 | 0.410 | 0.856 | 0.422 | 0.864 |
| **Re-ranking (BM25 retrieval)** | | | | | |
| TCT-ColBERT | 186 + 2 | 0.426 | 0.827 | 0.439 | 0.842 |
| ANCE | 186 + 14 | 0.389 | 0.836 | 0.392 | 0.857 |
| BERT | 185 + 14 | 0.353 | 0.715 | 0.275 | 0.576 |
| **Interpolation (with BM25)** | | | | | |
| TCT-ColBERT | 186 + 14 | 0.438 | 0.894 | 0.460 | 0.902 |
| w/ Fast-Forward | 114 | 0.438 | 0.894 | 0.460 | 0.902 |
| w/ Fast-Forward, early stopping | 72 | - | 0.894 | - | 0.902 |
| ANCE | 186 + 14 | 0.417 | 0.856 | 0.435 | 0.864 |
| w/ Fast-Forward | 114 | 0.417 | 0.856 | 0.435 | 0.864 |
| w/ Fast-Forward, early stopping | 52 | - | 0.856 | - | 0.864 |
| BERT | 185 + 2 | 0.378 | 0.809 | 0.392 | 0.532 |
Table: TREC-DL 2019 passage ranking results; the left metric group uses $k_S = 1000$, the right one $k_S = 5000$. Latency is reported for $k_S = 5000$ on GPU and CPU.
leonhardt@l3s.de
`pip install fast-forward-indexes`
Icons created by juicy_fish (Flaticon).