Strix Halo Unified Memory AI

Live benchmarking of the Strix Halo Ryzen AI Max+ 395, demonstrating builds of PyTorch with AOTriton Flash Attention (FA) and of vLLM, along with performance metrics and the feasibility of unified-memory AI.

Overview

One of my hobbies is poking around with AI/ML on RDNA3, and I had early access to a Framework Desktop (which I can bring to show off). I will show what it can do: how fast it runs and what software it supports. Besides testing and improving performance, I also created the first build scripts for building PyTorch with AOTriton FA and vLLM.

Links

https://github.com/lhl/strix-halo-testing/
Benchmarks PyTorch/vLLM LLM inference performance using ROCm and Flash Attention.
https://strixhalo-homelab.d7.wtf/AI/AI-Capabilities-Overview
Documents Strix Halo LLM inference optimization with Vulkan/ROCm and LPDDR5x memory tuning.

Tech stack