Members-Only
Recent Talks & Demos are for members only
You must be an AI Tinkerers active member to view these talks and demos.
OpenWeb UI LiteLLM Safe Chat
Learn how to assemble an open‑source chat system with OpenWeb UI, LiteLLM, Azure/Bedrock models, internal RAG, voice, image, and code integration.
Stop building chat applications, reuse existing ones and extend them ! Sharing a technical journey to use a safe and compliant solution for chatting with private project data, leveraging local and remote LLMs, internal RAG systems, MCP servers, live voice, image generation, web search, and a code interpreter. The result is similar to ChatGPT but built entirely on open-source technologies and trusted cloud providers LLM APIs.
Using OpenWeb UI as the chat front-end, connected to a foundation model on Azure AI and integrated with internal RAG system for documentation. LiteLLM serves as the OpenAPI layer for both OpenWeb UI and server calls, supporting claude code, codex-cli, and a custom workaround for gemini-cli, as we operate exclusively on AWS Bedrock and Azure AI.
OpenWebUI/LiteLLM deployment configurations using Docker Compose, Qdrant vector database, and PostgreSQL.
Excalidraw is a drawing canvas managing themes via localStorage and `prefers-color-scheme` media queries.
- AWS BedrockAWS Bedrock is the fully managed, serverless platform providing a unified API gateway to diverse, high-performing foundation models (FMs) from providers like Anthropic, AI21 Labs, and Amazon Titan.Amazon Bedrock is your fully managed, serverless platform for building and scaling generative AI applications. It offers a single API to access a curated selection of industry-leading foundation models (FMs) from partners (e.g., Anthropic's Claude, Meta's Llama 3.1) and Amazon (Titan family). Developers leverage core features: use Knowledge Bases for Retrieval Augmented Generation (RAG) with proprietary data, deploy Agents for complex task automation, and implement Guardrails for responsible AI policies. This streamlined approach ensures enterprise-grade security and simplifies model customization (fine-tuning) without managing underlying infrastructure.
- LiteLLMLiteLLM is the unified LLM gateway: call 100+ models (OpenAI, Anthropic, Azure, etc.) using a single, standardized OpenAI-compatible API.LiteLLM acts as your production-grade LLM gateway, simplifying complex multi-model deployments. It unifies over 100 LLM providers—including OpenAI, Anthropic, and VertexAI—under a single, consistent API call structure (the OpenAI format). This standardization eliminates SDK friction. Key features include the LiteLLM Router for automatic retry and fallback logic across deployments, ensuring high reliability. Additionally, the Proxy Server centralizes cost tracking, allows granular budget setting per virtual key, and provides load balancing, making it essential for ML Platform teams managing scalable, cost-optimized Gen AI applications.
- QdrantQdrant is an open-source, Rust-powered vector database and search engine: it delivers high-performance, scalable similarity search for AI applications.Qdrant functions as a production-ready vector database, purpose-built in Rust for unmatched speed and reliability, even when processing billions of high-dimensional vectors. It provides a convenient API to store, search, and manage vector embeddings (points) along with optional metadata (payloads). Key features include advanced filtering on those payloads, support for multiple distance metrics (Cosine, Dot Product, Euclidean), and cloud-native scalability. Developers leverage Qdrant for critical AI workloads like Retrieval-Augmented Generation (RAG) systems and large-scale recommendation engines, deploying via Docker, self-hosting, or the managed Qdrant Cloud service.
- MinIOMinIO: High-performance, S3-compatible object storage, optimized for AI/ML and cloud-native infrastructure.MinIO is a high-performance, cloud-native object storage server, fully compatible with the Amazon S3 API. Built on a lightweight, single-binary, shared-nothing architecture, it delivers industry-leading throughput for demanding workloads: think AI/ML, big data analytics, and containerized applications on Kubernetes. It is 100% open-source, licensed under GNU AGPLv3, and designed for global-scale deployment from the private cloud to the edge.
- KubernetesKubernetes (K8s): Production-grade container orchestration: automate deployment, scaling, and management across your cluster.Kubernetes (K8s) is your control plane for planet-scale container orchestration: it automates the deployment, scaling, and management of containerized applications across your cluster. Built on 15 years of Google's production experience (Borg), K8s ensures your *desired state* is always maintained. Core resources like Pods, Deployments, and Services manage auto-scaling, load balancing, and self-healing for you. You interact directly with the API server using `kubectl` (the command-line tool) to execute zero-downtime rollouts and rapid rollbacks. As a CNCF project, it provides vendor-neutral flexibility for any infrastructure: cloud, on-premises, or hybrid.
Related projects
AI Crawling behavior - Cloudflare view
Montreal
An analysis of AI crawler activity across 25% of the web using Cloudflare Radar data, highlighting observed behaviors,…
Nandayo - AI Driven Support Agent
Montreal
Learn how Nandayo, an AI-driven support agent, integrates with any infrastructure to automatically monitor, triage, and resolve routine…
SirPlotsALot
Montreal
Learn how to connect LLMs with JavaScript‑based data analysis, using prompting, function calling, and secure sandboxed code for…
An abecedary of AI: towards 26 generative vibe coded AI experiments
Montreal
Explore how a letter‑by‑letter AI lab uses Linux cron, API chaining, and multimodal models to create autonomous generative…
Data driven AI personas in Marketing, or how to hallucinate in the right direction
Montreal
Discover how to create marketing personas by grounding LLM outputs in census data, then test campaign ideas using…
Working with AI: Code Conversion (Delivered by Dwayne Forde)
Toronto
Learn how an LLM‑driven workflow automates bulk code conversion, preserving context, and frees engineers to tackle critical project…