Strix Halo Unified Memory AI
Live benchmarking of the Strix Halo Ryzen AI Max+ 395, demonstrating PyTorch builds with AOTriton Flash Attention (FA) and vLLM builds, along with performance metrics and the feasibility of unified-memory AI.
One of my hobbies is poking around with AI/ML on RDNA3. I had early access to a Framework Desktop (which I can bring to show off), and I will show what it can do: how fast it runs and what software it supports. Besides testing and improving performance, I also created the first build scripts for building PyTorch with AOTriton FA and vLLM.
Benchmarks PyTorch/vLLM LLM inference performance using ROCm and Flash Attention.
Strix Halo LLM inference optimization uses Vulkan/ROCm with LPDDR5x memory tuning.
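As a rough illustration of why memory bandwidth matters for unified-memory inference, here is a minimal sketch of the common rule of thumb that single-stream token generation is bounded by how fast the model weights can be read from memory. The function name and the example inputs are hypothetical placeholders, not measurements from the talk, and the bound ignores KV-cache traffic, compute limits, and batching.

```python
def decode_tokens_per_second_bound(
    bandwidth_gb_s: float,
    params_billion: float,
    bytes_per_param: float,
) -> float:
    """Upper bound on decode throughput, assuming every weight is read once per token.

    This is a back-of-the-envelope estimate only; real throughput is lower.
    """
    if min(bandwidth_gb_s, params_billion, bytes_per_param) <= 0:
        raise ValueError("all inputs must be positive")
    # 1e9 parameters times bytes per parameter gives the weight size in GB.
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


# Illustrative placeholder inputs (not benchmark results):
# 200 GB/s of usable bandwidth, an 8B-parameter model, 4-bit weights (0.5 bytes each).
example = decode_tokens_per_second_bound(
    bandwidth_gb_s=200.0, params_billion=8.0, bytes_per_param=0.5
)
print(f"{example:.1f} tokens/s upper bound")
```

Comparing a measured tokens-per-second figure against this kind of bound is one way to judge how much headroom a given backend leaves on the table.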
- GPT-4: GPT-4 is OpenAI’s large multimodal model: it processes both text and image inputs, delivering human-level performance on complex professional and academic benchmarks. This is OpenAI’s latest milestone in scaling deep learning: a large multimodal model accepting both text and image inputs. It demonstrates a significant capability leap over its predecessor, scoring in the top 10% on a simulated bar exam (GPT-3.5 scored in the bottom 10%). The model handles nuanced instructions and long-form content, supporting context windows up to 32,768 tokens (32K model). This capacity allows processing up to 25,000 words in a single, complex prompt. GPT-4 is engineered for enhanced reliability, steerability, and advanced reasoning across diverse tasks.
- Llama-2: Llama 2 is Meta AI's powerful, openly accessible family of large language models (LLMs), featuring models from 7B to 70B parameters for research and commercial applications. Llama 2 is Meta AI's next-generation LLM family, released for free research and commercial use. The collection includes both pre-trained foundation models and instruction-tuned 'Chat' variants, scaling from 7 billion (7B) up to 70 billion (70B) parameters. Key technical upgrades over Llama 1 involve training on 2 trillion tokens (40% more data) and doubling the context length to 4096 tokens. The Llama-2-chat models were rigorously aligned using Reinforcement Learning from Human Feedback (RLHF), positioning them as a top-tier, openly available option for developers building advanced generative AI solutions.
- PyTorch: PyTorch is the open-source machine learning framework: it provides a Python-first tensor library with strong GPU acceleration and a dynamic computation graph for building deep neural networks. PyTorch, developed by Meta AI, is a premier open-source deep learning framework favored in both research and production environments. Its core is a powerful tensor library (like NumPy) optimized for GPU acceleration, delivering 50x or greater speedups for complex computations. The key differentiator is its 'Pythonic' design and dynamic computation graph (eager execution), which allows for rapid prototyping and simplified debugging compared to static-graph frameworks. Leveraging its Autograd system for automatic differentiation, practitioners build and train models for computer vision and NLP; major companies like Tesla (Autopilot) and Microsoft utilize PyTorch for critical AI applications.
- LangChain: The open-source framework for building and deploying reliable, data-aware Large Language Model (LLM) applications. LangChain is the essential framework for engineering LLM-powered applications: it simplifies connecting models (like GPT-4 or Claude) to external data, computation, and APIs. The platform provides a modular set of components (Chains, Agents, Tools, and Memory), allowing developers to quickly build complex workflows like Retrieval-Augmented Generation (RAG) pipelines and sophisticated conversational agents. Its Python and JavaScript libraries, combined with LangChain Expression Language (LCEL), offer a standardized interface for rapid prototyping and moving applications to production with confidence.
- OpenAI API: Your direct gateway to cutting-edge AI models (GPT-4o, DALL-E 3, Whisper), enabling scalable, multimodal intelligence integration into any application. The OpenAI API provides authenticated, programmatic access to a powerful suite of generative AI models. Developers leverage REST endpoints and official libraries (Python, Node.js) to integrate capabilities like advanced text generation (GPT-4o), image creation (DALL-E 3), and speech-to-text transcription (Whisper). This platform is engineered for scale, supporting millions of daily requests for tasks from complex reasoning to real-time customer support agents, ensuring your application gets reliable, state-of-the-art intelligence.