Running OpenCode with local models on NVIDIA DGX Spark | Seattle

January 31, 2026 · Seattle

OpenCode Local Models on DGX

See OpenCode agents run with local LLMs on a DGX Spark, comparing performance metrics against OpenRouter and Anthropic models.

Overview
Tech stack
  • OpenCode
    OpenCode is the open-source AI coding agent (CLI tool), integrating LLMs like GPT-5 and Claude Sonnet 4 directly into the terminal for fast, context-aware development.
    OpenCode is the open-source AI coding agent, built for terminal-first developers who demand speed and privacy. It connects your local files, Git history, and a choice of LLMs (e.g., OpenAI's GPT-5 Nano, Anthropic's Claude Sonnet 4) to execute complex tasks directly from the command line. The tool bypasses IDE and browser dependencies, allowing developers to triage issues, fix errors, or implement features with commands like `opencode fix error in main.go`. With over 26,000 GitHub stars by October 2025, OpenCode delivers a secure, context-aware coding partner that keeps your code local and your workflow efficient.
  • NVIDIA DGX Spark
    The desktop AI supercomputer: DGX Spark delivers 1 petaFLOP of FP4 performance via the GB10 Grace Blackwell Superchip.
    This is the DGX Spark: your personal AI supercomputer, built for serious local development. It packs the GB10 Grace Blackwell Superchip (20-core Arm CPU, Blackwell GPU) and 128GB of unified memory into a compact desktop form factor (1.2 kg). You can prototype, fine-tune, or run inference on models of up to 200 billion parameters right at your desk. It ships with DGX OS and the full NVIDIA AI software stack (CUDA, TensorRT), which eases the path from local work to data center deployment.
  • Olmo-3
    The Allen Institute for AI’s latest open-source language model featuring 440 billion tokens of training and full pipeline transparency.
    OLMo-2 (the architecture driving the Olmo-3 series) delivers a state-of-the-art open language model framework built by the Allen Institute for AI (AI2). This iteration prioritizes data integrity and reproducibility, providing the full training code, weights, and the Dolma dataset (3 trillion tokens). With a 7-billion-parameter dense architecture, it matches or exceeds Llama 3 performance on benchmarks like MMLU and GSM8K while remaining entirely accessible for academic and commercial audit.
  • GLM-4
    Zhipu AI’s flagship large language model featuring a 128k context window and performance metrics rivaling GPT-4.
    GLM-4 is Zhipu AI’s high-performance foundational model, designed to compete directly with GPT-4. It handles a 128,000-token context window (enough for a 300-page document) and executes complex tasks via its All Tools framework: browsing, code execution, and image generation. Its MMLU and GSM8K scores place it among the top-tier models for reasoning and mathematics. The ecosystem includes specialized versions like GLM-4-9B for edge deployment and the full-scale API for enterprise applications. It is a strong choice for bilingual Chinese-English deployments requiring precision and scale.
  • GPT-OSS:20b
    A 20-billion-parameter open-weight language model optimized for high-throughput inference and transparent architectural auditing.
    GPT-OSS:20b offers a robust alternative to proprietary systems. It uses a mixture-of-experts transformer architecture (roughly 21B total parameters, of which about 3.6B are active per token), balancing computational efficiency with deep reasoning capabilities (ideal for complex coding tasks and long-form content generation). Released with open weights, this model allows developers to self-host on enterprise hardware like the NVIDIA A100 (80GB) while maintaining full control over data privacy and fine-tuning.
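
The memory figures quoted above (128GB of unified memory, models of up to 200 billion parameters, an 80GB A100) can be sanity-checked with simple arithmetic. The following is a minimal sketch, not part of the talk: the helper names `weight_gb` and `fits` are hypothetical, and it assumes the weights dominate memory use, ignoring KV cache, activations, and runtime overhead, so real headroom will be smaller.

```python
# Back-of-the-envelope check: do a model's weights fit in a memory budget?
# Assumption: only the weights are counted; KV cache, activations, and
# runtime overhead are ignored, so this is an optimistic estimate.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str, memory_gb: float) -> bool:
    """True when the weights alone are no larger than the memory budget."""
    return weight_gb(params_billion, precision) <= memory_gb


if __name__ == "__main__":
    DGX_SPARK_GB = 128  # unified memory quoted in the DGX Spark entry
    for name, size_b in [("200B model", 200), ("20B model", 20), ("7B model", 7)]:
        for precision in BYTES_PER_PARAM:
            gb = weight_gb(size_b, precision)
            verdict = "fits" if fits(size_b, precision, DGX_SPARK_GB) else "too large"
            print(f"{name:>10} @ {precision}: {gb:6.1f} GB -> {verdict}")
```

Under these assumptions a 200B-parameter model needs about 100 GB at FP4, which is why the 200-billion-parameter figure is plausible only with 4-bit weights; the same model at FP16 would need about 400 GB.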

Related projects