You must be an AI Tinkerers active member to view these talks and demos.
Gemini: Production ESG KPI Extraction
Learn to build a production ESG KPI extraction pipeline. This talk covers cost control with Gemini API caching, parallel processing, structured output validation, and multi-document conflict resolution.
A practical deep-dive into building an AI-powered system that extracts 170+ structured KPIs from ESG/financial PDF documents. I’ll walk through the real engineering decisions behind our two-stage extraction pipeline: how we use Gemini’s Files API with explicit caching to control costs, how we process structured outputs in parallel to speed up extraction, how we use LLM-based conflict resolution in multi-document scenarios, and how we evaluate the pipeline. Expect code snippets, architecture diagrams, and honest lessons learned from development.
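As a rough sketch of how caching and parallel extraction can fit together, the Python below prepares a document handle once and fans out batches of KPI requests that all reuse it, with a semaphore bounding concurrency. Everything in it is an assumption made for illustration: `create_cache` and `extract_batch` are hypothetical stubs that make no network calls, and the batch size, concurrency limit, and KPI names are placeholders rather than values from the pipeline described above.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedDocument:
    """Hypothetical handle for a document that has been uploaded and cached once."""
    cache_id: str


async def create_cache(pdf_name: str) -> CachedDocument:
    # Stub: a real pipeline would upload the PDF and create a cache entry here.
    await asyncio.sleep(0)
    return CachedDocument(cache_id=f"cache::{pdf_name}")


async def extract_batch(doc: CachedDocument, kpi_names: list[str]) -> dict[str, str]:
    # Stub: a real pipeline would send one structured-output request that
    # references the cached document instead of re-sending the whole PDF.
    await asyncio.sleep(0)
    return {name: f"value of {name} from {doc.cache_id}" for name in kpi_names}


def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split the KPI list into fixed-size batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def extract_all(
    pdf_name: str,
    kpi_names: list[str],
    batch_size: int = 10,
    max_concurrency: int = 4,
) -> dict[str, str]:
    doc = await create_cache(pdf_name)  # done once per document
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(batch: list[str]) -> dict[str, str]:
        # The semaphore caps how many requests are in flight at the same time.
        async with semaphore:
            return await extract_batch(doc, batch)

    partials = await asyncio.gather(*(bounded(b) for b in chunk(kpi_names, batch_size)))

    merged: dict[str, str] = {}
    for partial in partials:
        merged.update(partial)
    return merged


# Placeholder KPI names, not the real KPI catalogue.
kpis = [f"kpi_{i}" for i in range(25)]
extracted = asyncio.run(extract_all("report.pdf", kpis))
```

In a real pipeline the two stubs would wrap calls to the model API. The point of the shape is that the expensive document context is set up once, while the per-batch requests run concurrently under a fixed limit.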
Klartext AI automates verifiable, explainable KPI extraction from complex PDFs.
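For multi-document scenarios, verifiability depends on keeping the source document and page next to every candidate value, so that a resolved KPI can still be traced back. The sketch below groups candidates by KPI, detects disagreement deterministically, and hands only the conflicting groups to a resolver. The resolver here is a plain rule (prefer the most recent reporting year) used as a stand-in for the LLM-based resolution mentioned above. The `Candidate` fields, the rule, and the sample values are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One extracted value together with its provenance (hypothetical fields)."""
    kpi: str
    value: float
    document: str
    page: int
    year: int


def group_by_kpi(candidates: list[Candidate]) -> dict[str, list[Candidate]]:
    grouped: dict[str, list[Candidate]] = defaultdict(list)
    for candidate in candidates:
        grouped[candidate.kpi].append(candidate)
    return grouped


def has_conflict(candidates: list[Candidate]) -> bool:
    # A conflict means the documents report more than one distinct value.
    return len({c.value for c in candidates}) > 1


def resolve_stub(candidates: list[Candidate]) -> Candidate:
    # Stand-in for an LLM call: prefer the most recent reporting year.
    return max(candidates, key=lambda c: c.year)


def resolve_all(candidates: list[Candidate]) -> dict[str, Candidate]:
    resolved: dict[str, Candidate] = {}
    for kpi, group in group_by_kpi(candidates).items():
        resolved[kpi] = resolve_stub(group) if has_conflict(group) else group[0]
    return resolved


# Made-up sample data for illustration only.
candidates = [
    Candidate("scope_1_emissions_tco2e", 1200.0, "annual_report.pdf", 42, 2023),
    Candidate("scope_1_emissions_tco2e", 1250.0, "sustainability_report.pdf", 17, 2024),
    Candidate("employee_count", 530.0, "annual_report.pdf", 8, 2023),
]
final = resolve_all(candidates)
```

Because each resolved entry is still a `Candidate`, the chosen value carries its document and page, which is what makes the result checkable by a human reviewer.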
- Google
  The global technology leader: Google organizes the world's information and delivers essential services (Search, Android, YouTube) to billions of users daily.
  Google (a subsidiary of Alphabet Inc.) dominates the digital landscape, starting with its core search engine, which processes trillions of queries annually. The company’s vast ecosystem extends far beyond search: Android powers over 70% of the world's smartphones; YouTube serves billions of hours of video content; and Google Cloud Platform (GCP) competes aggressively in the enterprise space. They drive innovation using AI and machine learning, focusing on making information universally accessible and useful through products like Gemini and the Pixel hardware line. This strategic portfolio maintains Google’s position at the center of consumer and enterprise technology.
- Gemini
  Google's natively multimodal AI model: understands and operates across text, code, audio, image, and video.
  Gemini is Google's most capable and general AI model, engineered from the ground up to be natively multimodal: it seamlessly understands and combines information across text, code, audio, image, and video inputs. The technology is optimized for flexibility, running efficiently on everything from data centers to mobile devices. It is deployed in three key sizes: Ultra (for highly complex tasks), Pro (for broad scaling), and Nano (for efficient on-device tasks). Developers access this power via the Gemini API to build next-generation applications.
- Gemini API
  Integrate Google's multimodal Gemini models (Pro, Flash) into your application via REST or SDKs: generate content, process up to 1000-page PDFs, and execute code with a 2-million token context.
  The Gemini API delivers Google's most advanced models (Gemini 3 Pro, 2.5 Flash) directly into your applications. Leverage its multimodal power: process text, images, video, and audio inputs for tasks like content generation, summarization, and visual understanding. Utilize key features including the 2-million token context window, structured JSON output, and function calling to build complex, reliable agents. Choose your integration: use the standard REST API, streaming endpoints (SSE), or the Live API (WebSockets) for real-time conversational experiences. Get started with the free tier and robust SDKs for Python, Java, Go, and more.
- FastAPI
  FastAPI is a modern, high-performance Python web framework for building APIs with automatic OpenAPI documentation.
  FastAPI is a robust, high-speed Python web framework: it is built on Starlette (for async capabilities) and Pydantic (for data validation and serialization). Leveraging standard Python 3.8+ type hints, the framework automatically generates interactive API documentation (Swagger UI/ReDoc) and enforces data validation, effectively reducing developer-induced errors by an estimated 40%. This architecture delivers performance on par with Node.js and Go, significantly increasing feature development speed (up to 300% faster). It is production-ready, fully supporting OpenAPI and JSON Schema standards for all API specifications.
- Python
  Python: The high-level, general-purpose language built for readability, powering everything from web backends to advanced machine learning models.
  Python is the high-level, general-purpose language prioritizing clear, readable syntax (via significant indentation), ensuring rapid development for any team. Its ecosystem is massive: use it for robust web development with frameworks like Django and Flask, or leverage its power in data science with libraries such as Pandas and NumPy. The Python Package Index (PyPI) provides thousands of community-contributed modules, offering immediate solutions for tasks from network programming to GUI creation. The language is actively maintained by the Python Software Foundation (PSF), with the stable release currently at Python 3.14.0 (as of November 2025).
- Next
  Next.js is the full-stack React framework: it delivers high-performance web applications via hybrid rendering and powerful, Rust-based tooling.
  This is the React Framework for production: Next.js enables you to build full-stack web applications with zero configuration and maximum efficiency. It supports a hybrid rendering approach (Server-Side Rendering, Static Site Generation, and Incremental Static Regeneration) for optimal speed and SEO performance. Key features include React Server Components, Server Actions for running server code directly, and the App Router for advanced routing and nested layouts. Developed by Vercel, it leverages Rust-based tools like Turbopack and the Speedy Web Compiler for the fastest possible builds and a superior developer experience.
- Google Gemini
  Gemini is Google's most capable, multimodal AI model: it seamlessly reasons across text, code, audio, image, and video.
  Gemini is Google's foundational, multimodal AI model, engineered to natively understand and combine text, code, image, audio, and video inputs. The technology is optimized across three sizes: Ultra (for highly complex tasks), Pro (for broad task scaling), and Nano (for efficient on-device performance). Gemini Ultra, for example, achieved a 90.0% score on the MMLU benchmark, surpassing human experts. It functions as a powerful AI assistant, integrated across Google services like Gmail and Maps, and features advanced tools like Deep Research and custom AI experts (Gems). Its Pro version offers a long context window, handling up to 1,500 pages or 30k lines of code simultaneously.
- Flash
  Flash was a dominant multimedia software platform (Macromedia, then Adobe) used for creating vector graphics, animation, and Rich Internet Applications (RIAs), officially discontinued on December 31, 2020.
  Adobe Flash, formerly Macromedia Flash, was the industry standard for delivering interactive web content: animations, browser games, and embedded video players. The platform used the proprietary SWF file format and ActionScript programming language, driving the early 2000s web experience. Despite its widespread adoption, Flash faced increasing criticism for performance issues and persistent security vulnerabilities, notably after Steve Jobs’ 2010 open letter. Open standards like HTML5, WebGL, and WebAssembly provided viable, secure alternatives. Adobe announced its End-of-Life (EOL) in 2017, ceasing support on December 31, 2020, with Flash content blocked from running by January 12, 2021.
- Structured outputs
  Structured Outputs enforce a predefined schema (e.g., JSON, XML) on Large Language Model (LLM) responses, guaranteeing machine-readable, consistent data for reliable system integration.
  This technology eliminates the unpredictability of free-form LLM text. Structured Outputs leverage techniques like constrained decoding and Context-Free Grammars (CFG) to strictly limit the model’s token generation, ensuring the output conforms to a user-defined JSON Schema or Pydantic model. This capability is mission-critical for production systems: it ensures 100% format reliability (a massive leap from the ~35% reliability of older 'JSON mode'). Use it for seamless data extraction, reliable function calling, and direct API integration, turning erratic text into clean, verifiable data for your downstream applications.
- Backend
  Stream real-time LLM responses and execute server-side functions through a single, persistent Server-Sent Events connection.
  This architecture leverages the Vercel AI SDK to bridge the gap between model generation and backend logic. By using streaming endpoints, the system delivers token-by-token text updates while transparently executing tools (like the 'getWeather' or 'queryDatabase' functions) without breaking the user session. This approach eliminates the 500ms overhead typical of standard REST round-trips: it keeps the UI responsive and the state synchronized. It is the gold standard for building interactive agents that need to act on live data while maintaining a conversational flow.
- Pydantic
  Pydantic is Python's most-used data validation library: it enforces data schemas using standard type hints and boasts a Rust-core for exceptional speed.
  Pydantic is the premier data validation and parsing library for Python. It mandates data structure using pure, canonical Python type annotations, drastically reducing boilerplate code. With over 360M monthly downloads, Pydantic is battle-tested: all FAANG companies and major frameworks (FastAPI, SQLModel, LangChain) rely on it for robust data handling. Its core validation logic is written in Rust, ensuring high performance. Pydantic models also generate JSON Schema, facilitating seamless integration and documentation for API development.
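To make the structured-output idea from the list above concrete, here is a minimal sketch of validating one model response before it enters downstream code. It uses only the standard library (`json` and `dataclasses`) as a dependency-free stand-in for a Pydantic model, and the field names (`name`, `value`, `unit`, `source_page`) are assumptions chosen for illustration, not a schema taken from the talk.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema: these field names are illustrative only.
REQUIRED_FIELDS = {"name", "value", "unit", "source_page"}


@dataclass(frozen=True)
class KpiValue:
    name: str
    value: Optional[float]  # None means the KPI was not reported
    unit: str
    source_page: int


def _is_number(x: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def parse_kpi(raw_json: str) -> KpiValue:
    """Parse and validate one JSON response; raise ValueError on any mismatch."""
    data = json.loads(raw_json)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    if not isinstance(data["name"], str) or not isinstance(data["unit"], str):
        raise ValueError("name and unit must be strings")

    value = data["value"]
    if value is not None and not _is_number(value):
        raise ValueError("value must be a number or null")

    page = data["source_page"]
    if isinstance(page, bool) or not isinstance(page, int) or page < 1:
        raise ValueError("source_page must be a positive integer")

    return KpiValue(
        name=data["name"],
        value=None if value is None else float(value),
        unit=data["unit"],
        source_page=page,
    )


# Made-up response for illustration.
example = parse_kpi('{"name": "water_use", "value": 12, "unit": "m3", "source_page": 3}')
```

With Pydantic available, the same checks would typically be expressed as a model class with type annotations, which also yields a JSON Schema that can be passed to the model API as the response schema.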