Building Knobase: Personalized AI Tutoring with mem0 Memory, ZeroEntropy RAG, and Real-Time Content Safety

Explore Knobase's five-layer AI tutoring system: personalized memory, ZeroEntropy RAG, real-time context, no-code teacher builder, and education safety stack, all orchestrated for <2-second latency.

Overview

Context: Knobase powers personalized AI tutoring for 1,000+ students across various schools in Asia.

The core insight: teachers need AI agents they can configure and trust without writing code, and students need personalization that remembers their learning style across sessions, rather than generic ChatGPT wrappers that hallucinate or hand over direct answers when Socratic questioning would be more effective.

This talk is a technical deep-dive into how we orchestrate multiple AI systems to deliver that experience at scale, walking through five implementation layers:

Personalization with mem0 — A hybrid memory architecture combining mem0’s semantic vector search with a local confirmed-memories table in Supabase. As students chat, we extract preferences, learning styles, goals, and challenges via regex pattern matching and prompt scoring (confidence threshold ≥ 0.85). Students confirm these memories (“Yes, I’m preparing for IB exams”), and they’re injected into every subsequent chat. A 10th-grader studying physics gets reminders about their preference for step-by-step explanations; a university student preparing for finals gets context about their exam timeline. I’ll walk through getContextualMemoriesForPrompt() and how we merge local + mem0 results to build a per-student profile that persists across days and subjects.
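As a rough illustration of the local + mem0 merge described above, the sketch below combines student-confirmed memories with semantic search results and renders them for the prompt. It is not the actual getContextualMemoriesForPrompt() implementation: the Memory shape, mergeMemories, formatMemoriesForPrompt, and the choice to reuse the 0.85 threshold for unconfirmed mem0 results are all assumptions made for the example, and no real mem0 or Supabase calls are shown.

```typescript
// Hypothetical shapes; the real Supabase rows and mem0 results may differ.
interface Memory {
  text: string;
  source: "local" | "mem0";
  score: number; // 0..1; confirmed local memories are treated as 1
}

const MIN_MEM0_SCORE = 0.85; // assumption: reuse the abstract's threshold here

// Collapse case and whitespace so near-identical memories share one key.
function memoryKey(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

// Local confirmed memories are kept as-is and win over mem0 duplicates;
// mem0 results are kept only when they clear the score threshold.
function mergeMemories(local: Memory[], remote: Memory[], limit = 5): Memory[] {
  const seen = new Set<string>();
  const merged: Memory[] = [];
  const candidates = local.concat(
    remote
      .filter((m) => m.score >= MIN_MEM0_SCORE)
      .sort((a, b) => b.score - a.score),
  );
  for (const m of candidates) {
    const key = memoryKey(m.text);
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(m);
  }
  return merged.slice(0, limit);
}

// Render the merged memories as a block for the system prompt.
function formatMemoriesForPrompt(memories: Memory[]): string {
  if (memories.length === 0) return "";
  const lines = memories.map((m) => `- ${m.text}`);
  return ["What we know about this student:"].concat(lines).join("\n");
}
```

Putting confirmed memories first means a student's explicit "yes" always outranks an inferred match, which is one plausible way to keep the profile trustworthy.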

RAG with ZeroEntropy — Teachers upload textbooks, lecture slides, problem sets, and institutional syllabi. The document ingestion pipeline runs file upload → Supabase Storage → base64 encoding → ZeroEntropy with semantic chunking (chunk_size: 1800, overlap: 200). Collections are scoped per school (school_{id}) so students only retrieve content their teachers authorized. Retrieval uses topSnippets queries with metadata filtering by document/knowledge IDs, plus a parallel RAG agent that expands queries and aggregates deduplicated results. As a result, when a Harrow student asks “What’s Newton’s second law?”, the AI cites their specific uploaded physics textbook, not generic web content. I’ll show the filter-building logic and how we resolve documents through bot → knowledge → collective → document chains.
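To make the bot → knowledge → collective → document resolution concrete, here is a minimal sketch of how a retrieval scope could be derived before querying. The record shapes (Bot, Knowledge, Collective), the RetrievalScope type, and resolveRetrievalScope are hypothetical names for illustration; the sketch deliberately stops short of building a ZeroEntropy request, since the exact filter syntax is not given in this abstract.

```typescript
// Hypothetical record shapes for the bot -> knowledge -> collective -> document chain.
interface Bot { id: string; schoolId: string; knowledgeIds: string[] }
interface Knowledge { id: string; collectiveIds: string[] }
interface Collective { id: string; documentIds: string[] }

interface RetrievalScope {
  collection: string;    // per-school collection, e.g. "school_42"
  documentIds: string[]; // documents this bot is allowed to cite
}

function resolveRetrievalScope(
  bot: Bot,
  knowledgeById: Map<string, Knowledge>,
  collectivesById: Map<string, Collective>,
): RetrievalScope {
  const documentIds = new Set<string>();
  for (const knowledgeId of bot.knowledgeIds) {
    const knowledge = knowledgeById.get(knowledgeId);
    if (!knowledge) continue; // skip dangling references
    for (const collectiveId of knowledge.collectiveIds) {
      const collective = collectivesById.get(collectiveId);
      if (!collective) continue;
      for (const documentId of collective.documentIds) {
        documentIds.add(documentId); // the Set removes duplicates across collectives
      }
    }
  }
  return {
    collection: `school_${bot.schoolId}`,
    documentIds: Array.from(documentIds).sort(),
  };
}
```

The resulting document ID list is what a metadata filter would be built from, and the collection name keeps every query inside the student's own school.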

Real-Time Context API — Teachers can connect external data sources (Google Sheets of upcoming assignments, Notion databases of class resources, live sports scores for a journalism class analyzing data) via webhook-based context providers. On every message, we call registered providers, AI-process the response with token optimization (60-90% reduction), and inject it alongside RAG results. Example: A history teacher at ISF Academy configured a timeline of World War II events that updates the AI’s context window in real time, so students always get era-appropriate answers. I’ll trace the full flow from chat.context_config → provider webhook → context processing → system prompt assembly.
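The context-processing step can be pictured with the sketch below, which takes provider responses that have already been fetched and turns them into a compact block for the system prompt. In production this step is AI-processed; the whitespace collapsing and character cap here are a deliberately simple stand-in, and ProviderResult, compactProviderText, and buildLiveContext are hypothetical names rather than Knobase's API.

```typescript
// Hypothetical result of calling one registered webhook provider.
interface ProviderResult {
  name: string; // e.g. "assignments-sheet"
  ok: boolean;  // false if the webhook failed or timed out
  body: string; // raw text returned by the provider
}

// Stand-in for the AI compaction step: collapse whitespace and cap the length.
function compactProviderText(body: string, maxChars: number): string {
  const collapsed = body.replace(/\s+/g, " ").trim();
  return collapsed.slice(0, Math.max(0, maxChars));
}

// Build the live-context section that is injected alongside RAG results.
function buildLiveContext(results: ProviderResult[], maxCharsPerProvider = 400): string {
  const sections: string[] = [];
  for (const result of results) {
    if (!result.ok) continue; // a failing webhook should not block the chat
    const text = compactProviderText(result.body, maxCharsPerProvider);
    if (text.length > 0) sections.push(`[${result.name}] ${text}`);
  }
  return sections.join("\n");
}
```

Capping each provider separately keeps one verbose data source from crowding the others out of the context window.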

No-Code AI Chat Builder for Teachers — Educators configure role, tone, age group (elementary/middle/high school/university), complexity, subject, language, Socratic mode toggle (forces the AI to ask guiding questions instead of giving direct answers), citation preferences, and custom instructions, all stored as custom_details JSON. Organization-level master prompts override per-bot settings for school-wide safety policies. I’ll show how the system prompt is assembled in 10 steps, including master prompt → bot intro → RAG context → memory context → custom instructions. A teacher creating a “Socratic Math Tutor” for 8th graders clicks 6 dropdowns and writes 2 sentences of instruction; the system generates a 2,000-token prompt behind the scenes that enforces age-appropriate language, refuses to solve homework directly, and cites only the uploaded textbook.
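A minimal sketch of the prompt assembly order follows, covering only the five stages named above rather than the full sequence. The CustomDetails fields are a guess at what custom_details might contain, and assembleSystemPrompt and the section labels are hypothetical.

```typescript
// Assumed subset of the custom_details JSON a teacher configures.
interface CustomDetails {
  role: string;
  tone: string;
  ageGroup: "elementary" | "middle" | "high school" | "university";
  subject: string;
  socraticMode: boolean;
  customInstructions?: string;
}

interface PromptInputs {
  masterPrompt?: string;  // organization-level policy, placed first
  details: CustomDetails;
  ragContext?: string;    // retrieved snippets
  memoryContext?: string; // per-student memories
}

function assembleSystemPrompt(input: PromptInputs): string {
  const d = input.details;
  let intro = `You are a ${d.role} teaching ${d.subject}. Audience level: ${d.ageGroup}. Tone: ${d.tone}.`;
  if (d.socraticMode) {
    intro += " Ask guiding questions instead of giving direct answers.";
  }
  const sections: Array<string | undefined> = [
    input.masterPrompt,
    intro,
    input.ragContext ? `Reference material:\n${input.ragContext}` : undefined,
    input.memoryContext ? `Student profile:\n${input.memoryContext}` : undefined,
    d.customInstructions ? `Teacher instructions:\n${d.customInstructions}` : undefined,
  ];
  // Drop sections that are missing or blank, then join in the fixed order.
  return sections
    .filter((s): s is string => typeof s === "string" && s.trim().length > 0)
    .join("\n\n");
}
```

Keeping the order fixed in one place makes it easy to reason about which layer a given instruction came from when a response needs debugging.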

Education Safety Stack — Real-time prompt scoring (Clarity/Specificity/Task Definition/Context/Structure on a 0-100 scale) runs on every student input via a Supabase Edge Function. Content flagging covers 7 categories (sexual content, bullying, profanity, racial bias, political sensitivity, harmful advice, PII detection) plus custom organization-defined flags (e.g., Harrow added “exam cheating detection”). A daily digest cron emails flagged messages to designated safety managers. Student interest keywords and learning purpose analysis are extracted as a side effect of scoring and fed back into mem0, creating a feedback loop where the AI becomes more personalized the more the student uses it. We’ve processed 16,000+ messages since December 2025 with this stack in production.
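The data flowing through the safety stack might look something like the sketch below: a five-dimension prompt score and a daily digest grouped by flag category. The category identifiers, the PromptScore and FlaggedMessage shapes, and the unweighted mean in overallPromptScore are assumptions for illustration; the abstract does not say how the dimensions are combined, and the classifier itself is out of scope here.

```typescript
// The seven built-in categories from the abstract; the identifiers are assumed.
const BUILT_IN_CATEGORIES = [
  "sexual_content", "bullying", "profanity", "racial_bias",
  "political_sensitivity", "harmful_advice", "pii",
] as const;

// Five scoring dimensions, each on a 0-100 scale.
interface PromptScore {
  clarity: number;
  specificity: number;
  taskDefinition: number;
  context: number;
  structure: number;
}

// Assumption: combine the dimensions with an unweighted mean.
function overallPromptScore(s: PromptScore): number {
  const values = [s.clarity, s.specificity, s.taskDefinition, s.context, s.structure];
  const sum = values.reduce((total, v) => total + v, 0);
  return Math.round(sum / values.length);
}

interface FlaggedMessage { messageId: string; category: string }

// Group flagged message IDs by category for the daily digest email.
function groupFlagsForDigest(
  flags: FlaggedMessage[],
  customCategories: string[] = [], // e.g. an organization-defined "exam_cheating"
): Record<string, string[]> {
  const allowed = new Set<string>([...BUILT_IN_CATEGORIES, ...customCategories]);
  const digest: Record<string, string[]> = {};
  for (const flag of flags) {
    if (!allowed.has(flag.category)) continue; // ignore unknown labels
    const bucket = digest[flag.category] ?? [];
    bucket.push(flag.messageId);
    digest[flag.category] = bucket;
  }
  return digest;
}
```

Treating custom flags as extra entries in the same allow-list lets a school add its own category without touching the digest logic.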

The code walkthrough will focus on the chat route orchestration (~3,700 lines) that ties all five layers together in a single request lifecycle. No slides: just live code, architecture diagrams on a whiteboard, and real examples from our production deployment.

Links

Tech stack