Technology

Reranking

Reranking applies a cross-encoder to re-sort the top-k results of an initial search by semantic relevance to the query, boosting retrieval precision by an estimated 15 to 25 percent.

Standard vector search is efficient but lacks nuance: it compares fixed embeddings computed separately for the query and for each document, so it never models how the query's terms relate to the document's terms. Reranking addresses this by passing the query together with each of the top candidates (for example, the top 50) through a specialized model, such as Cohere Rerank 3, that scores each pair's actual relevance. This two-stage approach filters out the 'near-miss' results that often confuse LLMs. It is often the most effective way to improve RAG performance without the overhead of fine-tuning an entire embedding model.
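
The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Cohere Rerank or any production system is implemented: the bag-of-words `embed` function stands in for a real embedding model, and `toy_pair_score` is a hypothetical placeholder for a cross-encoder that reads the query and document together. All function names here are invented for the example.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Assumption: stands in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def first_stage(query: str, docs: List[str], k: int) -> List[str]:
    """Stage 1: cheap retrieval that compares independently computed vectors."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def bigrams(text: str) -> set:
    """Set of adjacent word pairs, so word order is taken into account."""
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))


def toy_pair_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: scores the (query, doc) pair
    jointly, here as the fraction of query bigrams found in the document."""
    q = bigrams(query)
    return len(q & bigrams(doc)) / len(q) if q else 0.0


def rerank(query: str, candidates: List[str],
           score_pair: Callable[[str, str], float], top_n: int) -> List[str]:
    """Stage 2: re-sort the stage-1 candidates by the pairwise relevance score."""
    ranked = sorted(candidates, key=lambda d: score_pair(query, d), reverse=True)
    return ranked[:top_n]


docs = [
    "man bites dog",                  # near-miss: same words, opposite meaning
    "report: dog bites man in park",  # the relevant document
    "weather forecast for hamburg",
]
query = "dog bites man"

candidates = first_stage(query, docs, k=2)
reranked = rerank(query, candidates, toy_pair_score, top_n=2)
print(candidates)
print(reranked)
```

In this toy setup the first stage ranks the near-miss "man bites dog" first, because its word counts are identical to the query's, while the pairwise scorer moves the relevant document to the top. In a real pipeline, `toy_pair_score` would be replaced by a call to an actual reranking model.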

https://cohere.com/rerank

Related technologies

BERT (179) · BLOOM (115) · GPT-3 (191) · GPT-4 (528) · Inference (6) · Llama-2 (227) · Output (2) · PaLM 2 (116) · Prompt (1) · RoBERTa (118)

Recent Talks & Demos

Repeated Inference Improves LLM Output

Hamburg Sep 12

GPT-4 Inference