Luthien Control: Enforcing AI Behavior
We'll demonstrate a local‑first open‑source system that enforces verified AI control strategies, preventing assistants from bypassing tests or executing unsafe actions.
We’re building a local-first, open-source tool to make AI assistants behave.
AI Control is a new, empirical field, pioneered by Redwood Research, that focuses on mitigating potentially harmful AI actions. I’m not affiliated with Redwood, but I know them and I’ve worked down the hall from them. Their focus is on future “scheming” frontier AI, but the approaches they’ve developed and tested can also be applied to keeping prosaic AI systems in check.
We’re building a system that makes it easy to implement empirically verified AI Control strategies, as well as any other intervention you can imagine, and to run them locally. That lets you automate things like “make sure the AI isn’t bypassing tests” or “require explicit human confirmation, with a big obvious warning sign, before executing dangerous or suspicious-seeming tool calls”. We’re building on top of LiteLLM so this can be deployed for virtually any LLM-backed system.
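To make the two example interventions concrete, here is a minimal sketch of what such a policy check could look like. It is not Luthien’s actual code or API: the `ToolCall` and `Decision` types, the `evaluate` and `enforce` functions, and the regex lists are all hypothetical names chosen for illustration, and it deliberately avoids any LiteLLM calls.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical patterns for illustration only; a real deployment would tune these.
TEST_BYPASS_PATTERNS = [
    r"@pytest\.mark\.skip",   # silencing a test instead of fixing it
    r"--no-verify\b",         # skipping git hooks that run checks
]
DANGEROUS_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bdrop\s+(table|database)\b",
    r"\bgit\s+push\s+.*--force\b",
]


@dataclass
class ToolCall:
    """A tool call proposed by the assistant (hypothetical shape)."""
    name: str
    arguments: str


@dataclass
class Decision:
    action: str  # "allow", "confirm", or "block"
    reason: str


def evaluate(call: ToolCall) -> Decision:
    """Classify a proposed tool call against the two example policies."""
    for pattern in TEST_BYPASS_PATTERNS:
        if re.search(pattern, call.arguments, re.IGNORECASE):
            return Decision("block", f"possible test bypass: {pattern}")
    for pattern in DANGEROUS_PATTERNS:
        if re.search(pattern, call.arguments, re.IGNORECASE):
            return Decision("confirm", f"dangerous pattern: {pattern}")
    return Decision("allow", "no policy matched")


def enforce(call: ToolCall, ask_human: Callable[[str], bool]) -> bool:
    """Return True if the call may run; ask a human when confirmation is required."""
    decision = evaluate(call)
    if decision.action == "allow":
        return True
    if decision.action == "block":
        return False
    warning = (
        "!!! WARNING: DANGEROUS TOOL CALL !!!\n"
        f"{call.name}: {call.arguments}\n"
        f"({decision.reason})"
    )
    return ask_human(warning)
```

In an interactive setting, `ask_human` could be as simple as `lambda msg: input(msg + "\nProceed? [y/N] ").strip().lower() == "y"`. Pattern matching like this is only a stand-in; a policy could just as well call a second model to judge the tool call.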
Luthien delivers production-ready AI control systems for operational deployment.
The LiteLLM proxy integrates policy orchestration via FastAPI, PostgreSQL, and Redis.
Motivating example: a Replit AI agent executed a schema migration that dropped a production database because guardrails were missing.