Technology
Bi-encoder
A dual-tower architecture that maps queries and documents into a shared vector space for millisecond-scale semantic retrieval.
Bi-encoders process inputs through two independent BERT-style networks to generate fixed-size embeddings (typically 768 or 1024 dimensions). By decoupling the query and document processing, this architecture allows for the pre-computation and indexing of millions of document vectors using libraries like FAISS or Pinecone. While Cross-encoders offer higher accuracy by modeling token-level interactions, the Bi-encoder is the industry standard for the initial retrieval stage in RAG pipelines due to its ability to perform sub-second similarity searches across massive datasets.
Related technologies
Recent Talks & Demos
Showing 1-1 of 1