Gemini: Production ESG KPI Extraction

Learn to build a production ESG KPI extraction pipeline. This talk covers cost control with Gemini API caching, parallel processing, structured output validation, and multi-document conflict resolution.

Overview

A practical deep dive into building an AI-powered system that extracts 170+ structured KPIs from ESG/financial PDF documents. I’ll walk through the real engineering decisions behind our two-stage extraction pipeline: how we use Gemini’s Files API with explicit caching to control costs, how we process structured outputs in parallel to speed up extraction, how we use an LLM to resolve conflicts when several documents report the same KPI, and how we evaluate the pipeline. Expect code snippets, architecture diagrams, and honest lessons learned from development.
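To give a flavour of the parallel extraction and validation steps, here is a minimal sketch using only the Python standard library. It is not the pipeline presented in the talk. Every name in it (`KpiValue`, `validate`, `extract_document`, `find_conflicts`, `fake_model`, the KPI names and the file names) is hypothetical, the model call is replaced by a canned stub rather than a real Gemini request, and the conflict step only detects disagreements with a simple rule instead of resolving them with an LLM.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class KpiValue:
    """One validated KPI reading (hypothetical schema)."""
    name: str
    value: Optional[float]  # None means "not reported in this document"
    unit: str
    source_doc: str


def validate(raw: dict, source_doc: str) -> KpiValue:
    """Reject model output that does not match the expected shape."""
    name = raw.get("name")
    if not isinstance(name, str) or not name:
        raise ValueError(f"missing KPI name in {raw!r}")
    value = raw.get("value")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if value is not None and not is_number:
        raise ValueError(f"{name}: value must be numeric or null, got {value!r}")
    unit = raw.get("unit")
    if not isinstance(unit, str):
        raise ValueError(f"{name}: unit must be a string, got {unit!r}")
    return KpiValue(name, None if value is None else float(value), unit, source_doc)


def extract_document(
    doc_id: str,
    kpi_batches: list,
    call_model: Callable[[str, list], list],
    max_workers: int = 4,
) -> list:
    """Run one model call per KPI batch in parallel and validate every result."""
    def run(batch):
        return [validate(raw, doc_id) for raw in call_model(doc_id, batch)]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of kpi_batches.
        return [kpi for batch in pool.map(run, kpi_batches) for kpi in batch]


def find_conflicts(values: list) -> dict:
    """Group reported values by KPI name and keep those that disagree."""
    by_name = {}
    for v in values:
        if v.value is not None:
            by_name.setdefault(v.name, []).append(v)
    return {
        name: vs
        for name, vs in by_name.items()
        if len({(v.value, v.unit) for v in vs}) > 1
    }


def fake_model(doc_id: str, batch: list) -> list:
    """Stand-in for a structured-output model call; returns canned data."""
    canned = {
        "report_2023.pdf": {"scope1_emissions": 1200.0, "employee_count": 540},
        "factsheet.pdf": {"scope1_emissions": 1250.0, "employee_count": 540},
    }
    units = {"scope1_emissions": "tCO2e", "employee_count": "people"}
    return [
        {"name": k, "value": canned[doc_id].get(k), "unit": units[k]}
        for k in batch
    ]
```

Calling `extract_document` once per document with `fake_model` and passing the combined results to `find_conflicts` returns only the KPIs whose values differ across documents. In a real system those are the candidates that would be handed to a resolution step.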

Links

Tech stack