Transformer Lab: Local to Distributed ML
See a demo scaling ML training from a local notebook to a GPU cluster, covering checkpoint recovery, hyperparameter sweeps, and unified experiment tracking.
Our CEO and co-founder, Ali Asaria, will present. He is our main technical subject-matter expert on this topic and is best positioned to answer questions. This will not be a pitch: the audience will get the technical insights, how-tos, and best practices we’ve learned working with top research labs around the world.
We’ll demo the tool we built to scale from a local Jupyter notebook to a distributed training run across a cluster of GPUs. We’ll cover how we handle the “boring but critical” parts of the training workflow: automatic checkpoint recovery for spot instances, one-line hyperparameter sweeps, and unified experiment tracking that works across AMD, NVIDIA, and Apple Silicon.
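For readers unfamiliar with checkpoint recovery on spot instances, the general idea can be sketched in plain Python. This is an illustrative sketch only, not Transformer Lab's implementation or API: the names `save_checkpoint`, `load_checkpoint`, `train`, and `Preempted` are hypothetical, the "training step" is a stand-in, and a real run would save model and optimizer state rather than a small JSON dictionary.

```python
import json
import os
import tempfile


class Preempted(Exception):
    """Hypothetical stand-in for a spot instance being reclaimed."""


def save_checkpoint(path, state):
    """Write state atomically so an interruption mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename over the old checkpoint in one step (same directory, same filesystem).
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "loss": None}


def train(path, total_steps, save_every=10, interrupt_at=None):
    """Run (or resume) a toy training loop, checkpointing every `save_every` steps."""
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if interrupt_at is not None and step == interrupt_at:
            raise Preempted("simulated spot preemption")
        state["loss"] = 1.0 / (step + 1)  # placeholder for a real training step
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

Calling `train` again with the same `path` after an interruption resumes from the last saved step instead of step 0. Work done since the last checkpoint is repeated, so `save_every` trades checkpoint overhead against lost progress.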