ACE: Self-Improving LLM Agents
A walkthrough of Agentic Context Engineering, showing how Generator, Reflector, and Curator iteratively refine context to counter brevity bias and boost LLM performance without retraining.
In this talk, I’ll present a practical walkthrough of Agentic Context Engineering (ACE) — a novel framework introduced in the paper “Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models.” I will explain how ACE addresses challenges like brevity bias and context collapse by enabling agents to iteratively refine their own contextual understanding through three core components: the Generator, Reflector, and Curator. Using a simplified case study, I’ll demonstrate how a language-model-based agent can evolve its behavior and improve performance over time without parameter updates, simply by re-engineering its context and memory representations.
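The Generator, Reflector, and Curator loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's or the speaker's implementation: the `Playbook` class, the function names, and the uppercase task are all hypothetical, and the "model" is a stub rather than a real LLM call. The one idea it tries to show is that behavior changes only because the context grows, with no parameter updates.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Playbook:
    """Evolving context: a list of short strategy bullets (hypothetical structure)."""
    bullets: List[str] = field(default_factory=list)


def generator(task: str, playbook: Playbook) -> str:
    """Stand-in for an LLM call that attempts the task using the current context."""
    # Toy behavior: the stub only uppercases its answer if the context says so.
    if any("uppercase" in bullet for bullet in playbook.bullets):
        return task.upper()
    return task


def reflector(task: str, output: str, expected: str) -> Optional[str]:
    """Compare the attempt with feedback and distill a lesson, or None if correct."""
    if output == expected:
        return None
    if output.upper() == expected:
        return "Answer in uppercase."
    return f"Check the output format for tasks like {task!r}."


def curator(playbook: Playbook, lesson: Optional[str]) -> None:
    """Merge a lesson as a small delta: append it, never rewrite the whole context."""
    if lesson is not None and lesson not in playbook.bullets:
        playbook.bullets.append(lesson)


def run_ace(tasks: List[Tuple[str, str]], playbook: Playbook, rounds: int = 2) -> List[float]:
    """Run the loop for a few rounds and return the accuracy of each round."""
    history = []
    for _ in range(rounds):
        correct = 0
        for task, expected in tasks:
            output = generator(task, playbook)
            correct += output == expected
            curator(playbook, reflector(task, output, expected))
        history.append(correct / len(tasks))
    return history


if __name__ == "__main__":
    playbook = Playbook()
    print(run_ace([("hello", "HELLO"), ("world", "WORLD")], playbook))
    print(playbook.bullets)
```

In this sketch the curator appends lessons instead of regenerating the whole context, which is one simple way to keep earlier lessons from being compressed away; whether that matches the talk's case study is an assumption.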
- GPT-4o: GPT-4o (omni) is OpenAI's flagship multimodal model: it delivers GPT-4 intelligence with native, real-time processing across text, audio, and vision. It is a single neural network that natively handles text, audio, and image inputs and outputs. It matches GPT-4 performance on English text and code, but surpasses it on non-English language, vision, and audio benchmarks. Speed is a major upgrade: it achieves human-level responsiveness in voice, with an average response time of 0.32 seconds (down from 5.4 seconds with GPT-4). Developers get a 128K token context window and a model that is more cost-efficient than its predecessor, making high-intelligence, real-time applications viable.
- Ada: A high-level, statically typed language designed for real-time systems where safety and reliability are non-negotiable. Ada remains the gold standard for high-integrity software in aerospace, defense, and rail. Originally commissioned by the U.S. Department of Defense (MIL-STD-1815), it provides robust compile-time checks and strong typing to eliminate common runtime errors. Modern standards like Ada 2022 integrate seamlessly with the SPARK toolset for formal verification. Whether managing flight control systems for Boeing or securing communication protocols, Ada delivers predictable performance through its native support for tasking and deterministic memory management.
- LangChain: The open-source framework for building and deploying reliable, data-aware Large Language Model (LLM) applications. LangChain is the essential framework for engineering LLM-powered applications: it simplifies connecting models (like GPT-4 or Claude) to external data, computation, and APIs. The platform provides a modular set of components (Chains, Agents, Tools, and Memory), allowing developers to quickly build complex workflows like Retrieval-Augmented Generation (RAG) pipelines and sophisticated conversational agents. Its Python and JavaScript libraries, combined with LangChain Expression Language (LCEL), offer a standardized interface for rapid prototyping and moving applications to production with confidence.
- LangGraph: A low-level orchestration framework for building long-running, stateful, and cyclic multi-agent systems using a graph-based architecture. LangGraph is the specialized, low-level runtime for developing complex AI agents, extending the LangChain ecosystem to handle intricate, stateful workflows. It models the agent's logic as a directed graph: nodes represent actions (LLM calls, tool use), and conditional edges dictate the flow, enabling critical features like cycles (loops) for iterative reasoning. This graph-based approach ensures durable execution, allowing agents to persist through failures and resume operations. Key capabilities include comprehensive memory management via a shared state object and built-in human-in-the-loop functionality (interrupts) for external oversight. This robust framework is trusted by production teams at companies like Klarna and Replit for deploying scalable, resilient agent architectures.
- Python: The high-level, general-purpose language built for readability, powering everything from web backends to advanced machine learning models. Python is the high-level, general-purpose language prioritizing clear, readable syntax (via significant indentation), ensuring rapid development for any team. Its ecosystem is massive: use it for robust web development with frameworks like Django and Flask, or leverage its power in data science with libraries such as Pandas and NumPy. The Python Package Index (PyPI) provides thousands of community-contributed modules, offering immediate solutions for tasks from network programming to GUI creation. The language is actively maintained by the Python Software Foundation (PSF), with the stable release currently at Python 3.14.0 (as of November 2025).
Related projects
Make any LLM improve itself automatically
Bogotá
Learn how Handit lets an LLM monitor its output, detect weaknesses, and automatically update its parameters using real‑time…
AI Agents - Getting Started
Bogotá
Learn how AI agents work and how to enhance language model functionality using open‑source tools and Google Vertex…
LLM Context Engineering 2.0->3.0
Berlin
Learn how to build reliable synthetic datasets using LLM context engineering, demonstrated with an automated design thinking example.
The Future of AI: How Self-Improving AI is Changing the Game
Medellín
Explore why most AI systems underperform, learn the difference between monitoring and true optimization, and see how HandIt.ai…
Meta-Agents in the Wild: Building, Simulating & Scoring Generative Agents
Medellín
This talk covers building meta-agentic systems that create and evaluate other agents, using simulations to test performance and…
AnythingLLM: the all-in-one platform for running local LLM models and AI agents
Manizales
Demo of AnythingLLM: installing a local AI app, loading GGUF models, using RAG with PDFs/Word/CSV, and connecting OpenAI,…