MindServe AI: GPU Vision and RAG
See a deployed system analyzing tennis video on GPU, streaming overlays, and using RAG with a vector database for real-time coaching insights.
MindServe AI is a fully deployed real-time inference system that analyzes tennis match video on GPU and streams structured coaching insights back to the user. The backend runs YOLOv8 and MediaPipe pose models in parallel, detects rallies with a state machine, and pushes frame overlays over WebSockets as processing happens. At the same time, structured match data is fed into a retrieval-augmented coaching engine that combines Pinecone with LLM reasoning to provide mental-performance feedback.
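To make the rally-detection idea concrete, here is a minimal sketch of a two-state machine in Python. It is not MindServe AI's actual implementation: the per-frame `ball_in_play` signal, the `detect_rallies` function, and the `start_frames` / `end_frames` thresholds are all hypothetical names chosen for illustration. The sketch assumes some upstream step has already reduced each frame to a single boolean.

```python
from enum import Enum, auto
from typing import Iterable, List, Tuple


class State(Enum):
    IDLE = auto()
    IN_RALLY = auto()


def detect_rallies(
    ball_in_play: Iterable[bool],
    start_frames: int = 3,  # hypothetical: active frames needed to open a rally
    end_frames: int = 5,    # hypothetical: inactive frames needed to close it
) -> List[Tuple[int, int]]:
    """Return (first_frame, last_frame) index pairs, one per detected rally."""
    state = State.IDLE
    streak = 0        # consecutive frames arguing for a state change
    rally_start = 0
    last_index = -1
    rallies: List[Tuple[int, int]] = []

    for i, active in enumerate(ball_in_play):
        last_index = i
        if state is State.IDLE:
            # Count consecutive active frames; any inactive frame resets.
            streak = streak + 1 if active else 0
            if streak >= start_frames:
                state = State.IN_RALLY
                rally_start = i - start_frames + 1
                streak = 0
        else:
            # Count consecutive inactive frames; any active frame resets.
            streak = 0 if active else streak + 1
            if streak >= end_frames:
                rallies.append((rally_start, i - end_frames))
                state = State.IDLE
                streak = 0

    # Close a rally that is still open when the input ends.
    if state is State.IN_RALLY:
        rallies.append((rally_start, last_index - streak))
    return rallies
```

Requiring several consecutive frames before switching state acts as a simple debounce, so a single missed or spurious detection does not split or start a rally. A production system would likely use richer signals than one boolean per frame.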
This talk will walk through the full infrastructure stack, from GPU scheduling to async model serving to vector-database-driven reasoning, all demonstrated live with running code.
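As a rough illustration of the async model serving pattern, the sketch below runs two blocking model calls concurrently for one frame and merges their results. It is an assumption-laden stand-in, not the talk's code: `detect_objects`, `estimate_pose`, and `analyze_frame` are hypothetical names, and the model calls are simulated with `time.sleep` rather than real YOLOv8 or MediaPipe inference.

```python
import asyncio
import time
from typing import Any, Dict


def detect_objects(frame_id: int) -> Dict[str, Any]:
    # Hypothetical stand-in for a blocking object-detector call.
    time.sleep(0.01)
    return {"boxes": []}


def estimate_pose(frame_id: int) -> Dict[str, Any]:
    # Hypothetical stand-in for a blocking pose-estimator call.
    time.sleep(0.01)
    return {"keypoints": []}


async def analyze_frame(frame_id: int) -> Dict[str, Any]:
    # Run both blocking calls in worker threads so the event loop stays
    # free, then wait for both results together.
    detections, pose = await asyncio.gather(
        asyncio.to_thread(detect_objects, frame_id),
        asyncio.to_thread(estimate_pose, frame_id),
    )
    return {
        "frame": frame_id,
        "boxes": detections["boxes"],
        "keypoints": pose["keypoints"],
    }
```

Calling `asyncio.run(analyze_frame(0))` returns one merged dictionary for frame 0. `asyncio.to_thread` requires Python 3.9 or later. Whether threads give real parallelism for GPU inference depends on the framework releasing the GIL during the call, which this sketch does not address.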