Members-Only
Recent Talks & Demos are for members only
You must be an AI Tinkerers active member to view these talks and demos.
Centaur: Agentic Decision Prediction
This talk presents an open-source framework for integrating Centaur's human decision prediction model into agentic workflows to reduce reviewer cognitive load.
Building off the human decision prediction model work done by the team behind Centaur. My team has been experimenting with ways to implement this decision prediction model in agentic workflows in order to reduce cognitive load for human reviewers and experts.
- Centaur LLMCentaur is the Llama 3-based LLM engineered for cognitive fidelity, simulating human biases and decision-making with high accuracy (r=0.86) for *in silico* social science.Centaur is a specialized Large Language Model (LLM) developed by researchers from Princeton and Google DeepMind; its core mission is cognitive fidelity, not factual accuracy. The architecture is built on Meta’s Llama 3-8B-Instruct and fine-tuned on the PEER/Psych-101 dataset: over 10 million human responses from 162 classic psychological experiments. This training allows Centaur to replicate the distribution of human answers, including known cognitive biases like the 'conjunction fallacy.' The model demonstrates high predictive power, achieving a correlation of $r=0.86$ with human response patterns on unseen tests. It functions as a critical new instrument for social scientists, enabling rapid *in silico* experimentation on human irrationality.
- Psych 101 DatasetPsych-101 is the massive dataset of human psychological experiment transcripts, detailing over 10 million trial-by-trial choices for cognitive modeling.This is the 'Psych-101' dataset: a comprehensive collection of natural language transcripts from human psychological experiments. It aggregates trial-by-trial data from 160 distinct experiments, capturing the decisions of 60,092 participants. The dataset documents an impressive 10,681,650 individual choices, providing a critical resource for researchers. Use this data to train foundation models, understand complex human decision-making processes, and benchmark new cognitive architectures against real-world human behavior.
- Claude APIAccess Anthropic's state-of-the-art Claude models (Opus, Sonnet, Haiku) via the RESTful Messages API, integrating advanced AI capabilities directly into your applications.The Claude API is Anthropic's direct developer interface for integrating their powerful large language models (LLMs) like Claude 3.5 Sonnet and Opus into production applications. It utilizes a robust Messages API for all conversational and generative interactions, supporting a massive 200,000-token context window for deep document analysis and sustained, complex reasoning. Developers leverage its Constitutional AI framework for built-in safety and utilize key features like Tool Use (function calling) and the Message Batches API for cost-efficient, high-volume processing. This is the direct, pay-as-you-go route for full feature control and cutting-edge model access.
- LangChainThe open-source framework for building and deploying reliable, data-aware Large Language Model (LLM) applications.LangChain is the essential framework for engineering LLM-powered applications: it simplifies connecting models (like GPT-4 or Claude) to external data, computation, and APIs. The platform provides a modular set of components—Chains, Agents, Tools, and Memory—allowing developers to quickly build complex workflows like Retrieval-Augmented Generation (RAG) pipelines and sophisticated conversational agents. Its Python and JavaScript libraries, combined with LangChain Expression Language (LCEL), offer a standardized interface for rapid prototyping and moving applications to production with confidence.
- ReplitReplit is the AI-powered, cloud-based development environment: go from natural language idea to deployed full-stack application in minutes, with zero setup.Replit is the definitive cloud-based development environment (CDE), enabling developers and teams to bypass complex local setup entirely. It supports hundreds of languages (e.g., Python, Node.js, C++) and features real-time, Google Docs-style collaboration for seamless pair programming. The core differentiator is the integrated Replit Agent: an AI developer that scaffolds, codes, and debugs full-stack applications from natural language prompts, accelerating the 'idea-to-app' cycle to minutes. Projects benefit from built-in version control (Git/GitHub integration) and one-click deployment to production, often leveraging Google Cloud infrastructure.
Related projects
Curation Assistants
Toronto
This talk introduces simplified AI assistants using the latest OpenAI technology for easy pre-meeting introductions, functioning as a…
WhimsyPaws
Toronto
A child-friendly game with an AI companion that logs interactions, uses LLMs to infer emotions, and explores privacy…
HypeDocs Employee Summaries
Toronto
This talk explores AI use cases for employee performance data, combining goals, accomplishments, and manager notes to generate…
Building an Empathetic AI agent
Toronto
The talk demonstrates how an Empathic AI Coach generates context‑aware clarifying questions using reinforcement learning with human feedback,…
Distilling Chaos: Practical RAG Systems for Real-World Decision Workflows
Toronto
Learn production RAG system architecture for millions of documents, hybrid search strategies, and how human feedback improves performance…
Collaborating Agents
Toronto
The talk explains how blockchain can record AI agents' actions, enabling transparent, auditable collaboration and verifiable decision‑making across…