CityPulseNYC: Multi-Modal RAG
Demonstrates building CityPulseNYC, a borough‑based video platform using Whisper transcription, LLaVA vision, 384‑dim embeddings, pgvector search, two‑stage retrieval, activity scoring, and async processing.
I’ll demonstrate CityPulseNYC, a hyperlocal video platform for NYC residents and tourists that lets users discover what’s happening in their borough through semantic search and video feeds.
The technical walkthrough will show:
THE SOFTWARE:
Borough-based video feed (Manhattan, Brooklyn, Queens, Bronx, Staten Island) with activity-based ranking
“Ask NYC”, a natural-language search across all boroughs and all kinds of video content (“What festivals are happening in Brooklyn this week?”)
Three-tier processing that makes videos playable in 5 seconds or less and fully searchable in 90 seconds
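One way to picture the tiered processing is a fast first tier that returns as soon as the video can be played, with the slower tiers running in the background. The sketch below is only an illustration under assumptions: the abstract does not say what each tier does, so the tier names, the `video` dictionary, and the status strings are all hypothetical, and the real work is replaced by placeholders.

```python
import asyncio


# Hypothetical tiers: the abstract does not describe what each tier does.
async def tier1_make_playable(video: dict) -> None:
    # Assumed fast path: just enough work for playback to start.
    video["status"] = "playable"


async def tier2_analyze(video: dict) -> None:
    # Placeholder for slower work such as transcription and vision analysis.
    await asyncio.sleep(0)
    video["analysis"] = "transcript and frame descriptions"


async def tier3_index(video: dict) -> None:
    # Placeholder for embedding the analysis and storing it for search.
    await asyncio.sleep(0)
    video["status"] = "searchable"


async def background_tiers(video: dict) -> None:
    await tier2_analyze(video)
    await tier3_index(video)


async def ingest(video: dict) -> asyncio.Task:
    # Finish tier 1 before returning, then schedule the rest without waiting.
    await tier1_make_playable(video)
    return asyncio.create_task(background_tiers(video))


async def demo() -> tuple:
    video = {"id": 1, "status": "uploaded"}
    task = await ingest(video)
    status_after_upload = video["status"]  # visible to viewers right away
    await task  # the background tiers finish later
    return status_after_upload, video["status"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that the upload request only waits on the first tier; a production system would more likely hand the later tiers to a job queue than to an in-process task.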
THE IMPLEMENTATION:
Multimodal RAG pipeline: Whisper transcription + LLaVA vision analysis -> 384-dim embeddings -> pgvector search
Two-stage result retrieval: strict semantic search, falling back to an expanded time window
Activity scoring algorithm prioritizing results related to the user’s search
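The two-stage retrieval idea can be sketched in plain Python: run a strict search over a recent time window, and only if it returns nothing, repeat the search over a wider window. This is a minimal in-memory illustration, not the platform's implementation. The similarity threshold, the 7-day and 30-day windows, the record fields, and the function names are all assumptions, and short 3-dimensional vectors stand in for the 384-dimensional embeddings.

```python
from datetime import datetime, timedelta
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, videos, now, min_sim, window):
    """Return ids of videos inside the time window, best match first."""
    cutoff = now - window
    hits = []
    for video in videos:
        if video["posted_at"] < cutoff:
            continue
        sim = cosine(query_vec, video["embedding"])
        if sim >= min_sim:
            hits.append((sim, video["id"]))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [video_id for _, video_id in hits]


def two_stage_search(
    query_vec,
    videos,
    now,
    min_sim=0.8,                         # assumed threshold
    strict_window=timedelta(days=7),     # assumed "this week" window
    expanded_window=timedelta(days=30),  # assumed fallback window
):
    # Stage 1: strict search over the recent window.
    results = search(query_vec, videos, now, min_sim, strict_window)
    if results:
        return results
    # Stage 2: same search, but over the expanded time window.
    return search(query_vec, videos, now, min_sim, expanded_window)


if __name__ == "__main__":
    now = datetime(2025, 1, 31)
    videos = [
        {"id": "old-festival", "embedding": [1.0, 0.0, 0.0],
         "posted_at": datetime(2025, 1, 10)},
        {"id": "recent-traffic", "embedding": [0.0, 1.0, 0.0],
         "posted_at": datetime(2025, 1, 30)},
    ]
    print(two_stage_search([1.0, 0.0, 0.0], videos, now))
```

In the example data, the only recent video is unrelated to the query, so the first stage comes back empty and the second stage surfaces the older matching video. With pgvector, the same filtering and ordering would typically be pushed into the SQL query rather than done in application code.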
ARCHITECTURE:
DB schema, software patterns used
The live demo will include actual NYC video content and SQL queries showing vector operations