Teaching AI to Read Ancient Maya Glyphs | Montreal

October 21, 2025 · Montreal

Teaching AI Maya Glyphs

Learn how computer‑vision models detect Maya glyph blocks, classify individual signs, and suggest readings using a curated epigraphic corpus, addressing data limits and ethical concerns.

Tech stack
  • YOLOv8
    YOLOv8 is an Ultralytics model family for real-time object detection, instance segmentation, and image classification, designed to balance speed and accuracy.
    Released by Ultralytics in January 2023, YOLOv8 is an iteration in the You Only Look Once (YOLO) series. It uses an anchor-free detection head and a PAN-style feature-fusion neck; dropping anchor boxes removes the anchor-related hyperparameters and, according to Ultralytics, improves on earlier versions such as YOLOv5. It provides a single framework for several vision tasks (detect, segment, classify, pose), which suits real-time applications (e.g., autonomous vehicle systems, advanced security). The model is available through a Python package and a Command Line Interface (CLI).
  • ResNet
    ResNet (Residual Network) uses skip connections to enable effective training of very deep neural networks (e.g., 152 layers); a ResNet ensemble won the ILSVRC 2015 classification task with a 3.57% top-5 error.
    ResNet, short for Residual Network, is a foundational convolutional neural network (CNN) architecture introduced by Microsoft Research in 2015. Its core innovation is the residual block, which incorporates 'skip connections' (identity mappings) that bypass one or more layers and add the block's input directly to its output. This design reformulates the layers to learn a *residual* function relative to the input, which addresses the degradation problem (accuracy saturating and then dropping as plain networks get deeper) and eases gradient flow through the network. It allowed researchers to train networks of up to 152 layers on ImageNet while still improving accuracy. ResNet-based entries took first place in five tracks across the ILSVRC and COCO 2015 competitions (ImageNet classification, detection, and localization; COCO detection and segmentation), setting a new benchmark for computer vision at the time.
  • Grad-CAM
    Grad-CAM (Gradient-weighted Class Activation Mapping) generates class-discriminative heatmaps that highlight the image regions a Convolutional Neural Network (CNN) relied on for its prediction.
    Grad-CAM is a technique for CNN explainability. It produces a coarse localization heatmap by computing the gradient of the target class score with respect to the feature maps of the final convolutional layer, averaging those gradients per channel to obtain weights, and taking a ReLU of the weighted sum of the feature maps. The result is a visual explanation of which input regions drive the model's decision for a specific class; because the map has the resolution of the convolutional layer, it is upsampled to the image size and is not pixel-exact. Its key advantage is that it works across a wide range of CNN architectures (e.g., VGG, ResNet) without architectural changes or re-training.
  • PyTorch
    PyTorch is an open-source machine learning framework: it provides a Python-first tensor library with strong GPU acceleration and a dynamic computation graph for building deep neural networks.
    PyTorch, originally developed by Meta AI and now governed by the PyTorch Foundation, is an open-source deep learning framework widely used in both research and production. Its core is a tensor library (similar to NumPy) with GPU acceleration, which can speed up large tensor computations by an order of magnitude or more, depending on the workload and hardware. The key differentiator is its 'Pythonic' design and dynamic computation graph (eager execution), which allows rapid prototyping and simpler debugging than static-graph frameworks. Using its Autograd system for automatic differentiation, practitioners build and train models for computer vision and NLP; major companies like Tesla (Autopilot) and Microsoft use PyTorch for AI applications.
  • FAISS
    FAISS (Facebook AI Similarity Search) is the open-source Meta AI library for high-performance, billion-scale similarity search and clustering of dense vectors.
    FAISS is engineered for datasets of up to billions of vectors and lets users tune the tradeoff between memory, speed, and accuracy. The library offers several index structures (e.g., IVF, HNSW, Product Quantization) for managing and querying high-dimensional data. It is written in C++ with complete Python wrappers, and its CUDA-enabled GPU implementations typically deliver a significant speedup (often 5x to 10x) over CPU-only operation. This makes it a common choice for large-scale applications: recommendation systems, image retrieval, and anomaly detection.
  • Annoy
    Annoy is Spotify’s C++ library with Python bindings for memory-efficient approximate nearest neighbor search in high-dimensional spaces.
    Annoy (Approximate Nearest Neighbors Oh Yeah) builds static, file-based indexes that are memory-mapped (mmap) so that multiple processes can share the same index. This lets systems handle millions of vectors (like 100-dimensional song embeddings) without duplicating the index in RAM. It uses random projections to build a forest of trees, and the speed-accuracy balance is controlled by two parameters: the number of trees, set at build time, and search_k, the number of nodes inspected at query time. Whether you are serving music recommendations to hundreds of millions of users or clustering research data, Annoy offers fast lookups and a small memory footprint.
  • hnswlib
    hnswlib is a header-only C++ library with Python bindings for fast, scalable Approximate Nearest Neighbor (ANN) search.
    hnswlib implements the Hierarchical Navigable Small World (HNSW) algorithm for vector similarity search over high-dimensional data. It achieves sublinear query time by organizing data points into a multi-layered graph: a search starts in the sparse top layer and descends layer by layer, moving from coarse to fine-grained exploration. The header-only C++ core and its Python bindings suit both prototyping and production use cases involving millions of data points, such as Retrieval-Augmented Generation (RAG) and recommendation systems. Key parameters like 'M' (connections per element), 'efConstruction' (candidate list size during index building), and 'ef' (candidate list size at query time) tune the speed-accuracy trade-off.
  • Milvus
    Milvus is a high-performance, open-source vector database built for scalable Approximate Nearest Neighbor (ANN) search in GenAI applications.
    Milvus is a cloud-native, distributed vector database designed to manage, index, and query billions of high-dimensional embedding vectors. Originally developed by Zilliz, it is an LF AI & Data Foundation project released under the Apache 2.0 License. Its architecture separates storage and compute, enabling horizontal scaling for massive datasets and real-time streaming updates. It supports diverse index types (HNSW, IVF, etc.) and is a core component for modern AI workloads like Retrieval-Augmented Generation (RAG), recommendation systems, and image retrieval.
  • Weaviate
    Weaviate: The open-source, AI-native vector database for high-performance hybrid search, scaling to billions of objects for RAG and semantic applications.
    Weaviate is the open-source, cloud-native vector database built for AI-first applications. It stores both objects and vectors, enabling high-performance hybrid searches (vector similarity plus structured filtering) across billions of data points. Engineered in Go for speed and reliability, it seamlessly integrates with major vectorizers (like OpenAI and HuggingFace) and supports critical use cases: RAG (Retrieval-Augmented Generation), recommendation engines, and high-scale semantic search. With over 20 million downloads and SDKs for Python, Go, and TypeScript, Weaviate is the production-ready core for your generative AI stack.
  • Pinecone
    Pinecone is a managed, cloud-native vector database for building high-performance AI applications (RAG, semantic search) at production scale.
    Pinecone is a specialized vector database for AI applications; the company was founded in 2019 by Edo Liberty. Its core function is managing and querying high-dimensional vector embeddings at scale, using Approximate Nearest Neighbor (ANN) search for rapid similarity matching. The platform offers a fully managed, cloud-native architecture, including a Serverless option that scales automatically and charges only for data stored and operations performed. Key features include hybrid search (combining sparse and dense vectors), real-time indexing, and enterprise security: Pinecone reports SOC 2 attestation and support for GDPR and HIPAA compliance. Companies like Gong and Vanguard use Pinecone for faster, more accurate retrieval in applications like customer support and smart tracking.
  • Qdrant
    Qdrant is an open-source, Rust-powered vector database and search engine: it delivers high-performance, scalable similarity search for AI applications.
    Qdrant is a production-ready vector database written in Rust for speed and reliability, designed to scale to billions of high-dimensional vectors. It provides an API to store, search, and manage vector embeddings (points) along with optional metadata (payloads). Key features include filtering on those payloads, support for multiple distance metrics (Cosine, Dot Product, Euclidean), and cloud-native scalability. Developers use Qdrant for AI workloads like Retrieval-Augmented Generation (RAG) systems and large-scale recommendation engines, deploying via Docker, self-hosting, or the managed Qdrant Cloud service.
  • Chroma
    Chroma is the open-source vector database engineered for AI: it simplifies the storage and retrieval of vector embeddings for large language models (LLMs).
    Chroma functions as the critical memory layer for modern Generative AI applications, specifically powering Retrieval-Augmented Generation (RAG). It stores vector embeddings (numerical representations of unstructured data like text or images) and associated metadata. This architecture enables low-latency, high-accuracy similarity searches using metrics like cosine distance. Developers can deploy it locally or use the managed Chroma Cloud, leveraging Python and JavaScript/TypeScript SDKs for rapid prototyping and production-scale LLM context retrieval.
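
The ResNet entry above describes a residual block as adding the block's input to the output of its layers. A minimal sketch of that idea in plain Python, assuming the learned layers can be stood in for by any function on a list of floats (`zero_transform` and `scale_transform` are hypothetical placeholders, not part of any library):

```python
from typing import Callable, List

Vector = List[float]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return transform(x) + x, i.e. the output of a block with an identity shortcut."""
    fx = transform(x)
    if len(fx) != len(x):
        # An identity shortcut only works when input and output sizes match.
        raise ValueError("identity shortcut needs matching input/output sizes")
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(x: Vector) -> Vector:
    # Hypothetical stand-in for learned layers whose output is all zeros.
    return [0.0 for _ in x]


def scale_transform(x: Vector) -> Vector:
    # Hypothetical stand-in for learned layers that halve each value.
    return [0.5 * v for v in x]


identity_out = residual_block([1.0, 2.0], zero_transform)
scaled_out = residual_block([1.0, 2.0], scale_transform)
```

When the stand-in layers output zeros, the block passes its input through unchanged, which illustrates why extra residual blocks need not make a deeper network worse than a shallower one.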
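
The Grad-CAM entry above describes weighting the final convolutional feature maps by gradients of the class score. The combination step can be sketched as follows, assuming the feature maps and their gradients are already available as nested lists (in practice they would come from a framework's automatic differentiation; the function name `grad_cam` and the toy inputs are illustrative only):

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(feature_maps: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine K feature maps into one heatmap using per-channel gradient averages."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    # One weight per channel: the spatial mean of that channel's gradients.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += w * fmap[i][j]
    # ReLU keeps only regions with a positive influence on the class score.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Toy input: two 2x2 channels; the second has negative gradients.
maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
cam = grad_cam(maps, grads)
```

The resulting map has the resolution of the convolutional layer, so a real pipeline would upsample it before overlaying it on the input image.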
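
The vector-search entries above (FAISS, Annoy, hnswlib, and the vector databases) all approximate or accelerate the same basic operation: finding the stored embeddings most similar to a query. As a point of reference, here is a minimal exact (brute-force) cosine search in plain Python. It is not the API of any of those libraries, and the names `cosine_similarity` and `top_k` are chosen for illustration:

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], corpus: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs for the k most similar corpus vectors."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy corpus of three 2-dimensional embeddings.
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k([1.0, 0.1], corpus, k=2)
hit_indices = [i for i, _ in hits]
```

This scan compares the query against every stored vector, so its cost grows linearly with the corpus; approximate indexes trade a small amount of recall to avoid that full scan.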
