RAG Authorization for Sensitive Data
Learn how to protect sensitive data in RAG pipelines using user permissions and relationship-based access control for granular security.
I will demo a way to protect users’ sensitive information in a RAG pipeline based on each user’s permissions. I will also discuss how relationship-based access control (ReBAC) is used in the pipeline to provide granular access control.
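As a rough illustration of the idea, the sketch below passes retrieved chunks through a relationship-based check before they can reach the prompt. It is a minimal, self-contained sketch and not the demo's actual code: the in-memory tuple store, the `check` and `authorized_chunks` functions, and all user, group, and document names are hypothetical stand-ins for what an authorization service such as OpenFGA would provide.

```python
from typing import NamedTuple


class RelTuple(NamedTuple):
    """A relationship tuple: `subject` has `relation` on `obj`."""
    subject: str
    relation: str
    obj: str


# Hypothetical relationship data (an assumption for illustration only).
TUPLES = {
    RelTuple("user:alice", "member", "group:hr"),
    RelTuple("group:hr#member", "viewer", "doc:salaries"),
    RelTuple("user:alice", "viewer", "doc:handbook"),
    RelTuple("user:bob", "viewer", "doc:handbook"),
}


def check(user: str, relation: str, obj: str) -> bool:
    """Return True if `user` has `relation` on `obj`, directly or via a group."""
    if RelTuple(user, relation, obj) in TUPLES:
        return True
    # Follow usersets such as "group:hr#member" one level deep.
    for t in TUPLES:
        if t.relation == relation and t.obj == obj and "#" in t.subject:
            group, group_relation = t.subject.split("#", 1)
            if RelTuple(user, group_relation, group) in TUPLES:
                return True
    return False


def authorized_chunks(user: str, retrieved: list[dict]) -> list[dict]:
    """Keep only the chunks whose source document the user may view."""
    return [c for c in retrieved if check(user, "viewer", c["doc_id"])]


# Pretend these chunks came back from a vector search (hypothetical data).
retrieved = [
    {"doc_id": "doc:salaries", "text": "Confidential salary information."},
    {"doc_id": "doc:handbook", "text": "General vacation policy."},
]

alice_view = [c["doc_id"] for c in authorized_chunks("user:alice", retrieved)]
bob_view = [c["doc_id"] for c in authorized_chunks("user:bob", retrieved)]
```

Here Alice sees both documents because she is a member of the HR group, while Bob sees only the handbook. Filtering after retrieval is only one possible design; the same check could instead be turned into a metadata filter applied inside the vector search.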
- Chroma: Chroma is the open-source vector database engineered for AI: it simplifies the storage and retrieval of vector embeddings for large language models (LLMs). Chroma functions as the critical memory layer for modern Generative AI applications, specifically powering Retrieval-Augmented Generation (RAG). It stores vector embeddings (numerical representations of unstructured data like text or images) and associated metadata. This architecture enables low-latency, high-accuracy similarity searches using metrics like cosine distance. Developers can deploy it locally or use the managed Chroma Cloud, leveraging Python and JavaScript/TypeScript SDKs for rapid prototyping and production-scale LLM context retrieval.
- LangChain: LangChain is an open-source framework for building and deploying reliable, data-aware Large Language Model (LLM) applications. It simplifies connecting models (like GPT-4 or Claude) to external data, computation, and APIs. The platform provides a modular set of components (Chains, Agents, Tools, and Memory) that lets developers build complex workflows such as Retrieval-Augmented Generation (RAG) pipelines and conversational agents. Its Python and JavaScript libraries, combined with LangChain Expression Language (LCEL), offer a standardized interface for rapid prototyping and for moving applications to production.
- OpenFGA: OpenFGA is the high-performance, open-source Fine-Grained Authorization (FGA) engine, built on Google's Zanzibar model. OpenFGA delivers fast, scalable authorization, implementing a Relationship-Based Access Control (ReBAC) system inspired by Google's Zanzibar paper. It provides a simple, expressive modeling language for defining access policies and offers low-latency authorization checks via HTTP and gRPC APIs. As a CNCF Incubating Project, it is production-ready and supports multiple SDKs (Go, .NET, JavaScript, Python), enabling developers to centralize complex authorization logic outside application code for increased velocity and compliance.
- RAG: RAG (Retrieval-Augmented Generation) is a GenAI technique that grounds LLMs (like GPT-4) on external data, reducing model hallucinations and making answers traceable to sources. It mitigates the LLM 'hallucination' problem by inserting a retrieval step before generation. A user query is vectorized, then used to query an external knowledge base (e.g., a Pinecone vector database) for relevant document chunks (for example, 512-token segments). The retrieved chunks augment the original prompt, giving the LLM (e.g., Gemini or Llama 3) the specific, current, or proprietary context it needs. This helps keep the final response accurate and grounded in domain-specific data, while avoiding the high cost and latency of full model retraining.
- Vector database: A vector database is a specialized system: it stores, indexes, and queries high-dimensional data embeddings for rapid, large-scale semantic similarity search. This technology is purpose-built to manage unstructured data (text, images, audio) by converting it into numerical arrays called vector embeddings (often 100 to 1,000+ dimensions). Unlike traditional databases, a vector database uses algorithms like HNSW (Hierarchical Navigable Small World) to index these vectors, enabling lightning-fast Approximate Nearest Neighbor (ANN) searches based on distance metrics (e.g., cosine similarity). This capability is critical for modern AI: it powers Retrieval-Augmented Generation (RAG) to provide contextual memory for Large Language Models (LLMs), drives semantic search engines, and delivers real-time, personalized recommendations with high recall accuracy.
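The retrieval step described in the RAG and vector database entries can be sketched in a few lines. This is a toy illustration under stated assumptions: the three-dimensional "embeddings", the `INDEX` contents, and the function names `top_k` and `build_prompt` are made up for the example, and the search is an exact scan over every vector, not the approximate HNSW index a real vector database would use.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], index: list[tuple], k: int = 2) -> list[str]:
    """Return the texts of the k entries most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Augment the user's question with the retrieved chunks."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy index of (text, embedding) pairs; the vectors are invented, not model output.
INDEX = [
    ("Chroma stores embeddings and metadata.", [0.9, 0.1, 0.0]),
    ("OpenFGA answers authorization checks.", [0.1, 0.9, 0.0]),
    ("LangChain composes LLM pipelines.", [0.0, 0.2, 0.9]),
]

query_vec = [0.8, 0.3, 0.0]  # stands in for the embedded user query
chunks = top_k(query_vec, INDEX, k=2)
prompt = build_prompt("Where are embeddings stored?", chunks)
```

In a real pipeline the query vector would come from an embedding model and the resulting prompt would be sent to the LLM; both steps are omitted here to keep the sketch self-contained.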