Dlab-852-Mini: Hong Kong Cultural AI
Showcasing Dlab-852-Mini, a Phi-3.5-mini fine-tune for Hong Kong culture evaluated with the CultureKit eval, covering training, evaluation, and case studies.
I’ll showcase our work at Decisions Lab on Dlab-852-Mini, a specialized fine-tune of Microsoft’s Phi-3.5-mini-instruct model designed to replicate and align with Hong Kong’s cultural perspectives, attitudes, and behaviors. Drawing on our cultural alignment research, we’ll explore how we created the CultureKit eval, a suite of CLI tools for assessing cultural biases in LLMs, and how we used it to train and evaluate the model. I’ll cover the fine-tuning process, benchmark results showing up to 2x better performance than base models at simulating local responses, and practical case studies.
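To make the evaluation idea concrete, here is a minimal sketch of how a cultural-dimension comparison could be scored. It is not CultureKit's actual API or the method used in the talk. It assumes each eval item is tagged with a cultural dimension and that the model picks one of two options, "A" or "B"; the function names, the dimension labels, and the idea of comparing against a reference profile are all hypothetical.

```python
from collections import defaultdict


def dimension_scores(answers):
    """Return, per dimension, the fraction of answers that chose option "A".

    `answers` is an iterable of (dimension, choice) pairs, where choice is
    "A" or "B". This two-option item format is an assumption for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # dimension -> [count of "A", total]
    for dimension, choice in answers:
        if choice not in ("A", "B"):
            raise ValueError(f"unexpected choice: {choice!r}")
        counts[dimension][0] += choice == "A"
        counts[dimension][1] += 1
    return {dim: n_a / total for dim, (n_a, total) in counts.items()}


def mean_abs_gap(model_scores, reference_scores):
    """Average absolute difference over the dimensions both profiles share.

    A smaller gap means the model's profile is closer to the reference one.
    """
    shared = model_scores.keys() & reference_scores.keys()
    if not shared:
        raise ValueError("no shared dimensions to compare")
    return sum(abs(model_scores[d] - reference_scores[d]) for d in shared) / len(shared)
```

Under these assumptions, you would compute `dimension_scores` once for the base model and once for the fine-tuned model, then compare each against a reference profile with `mean_abs_gap`. Any reference profile used this way would have to come from real survey data; none is provided here.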
Project URL
https://github.com/decisionslab/culturekit
Benchmarks LLM cultural dimensions using CDEval across MLX/Azure platforms.
MLX-optimized Phi-3.5 fine-tune simulates Hong Kong culture with CDEval validation.
- Python: The high-level, general-purpose language built for readability, powering everything from web backends to advanced machine learning models. Python prioritizes clear, readable syntax (via significant indentation), which supports rapid development. Its ecosystem is massive: use it for web development with frameworks like Django and Flask, or for data science with libraries such as Pandas and NumPy. The Python Package Index (PyPI) provides thousands of community-contributed modules for tasks from network programming to GUI creation. The language is actively maintained by the Python Software Foundation (PSF), with the stable release currently at Python 3.14.0 (as of November 2025).
- datasets: The core ML library for accessing, sharing, and processing thousands of AI-ready datasets (NLP, CV, Audio) with a single, efficient line of code. Datasets provides a unified API for data access and preprocessing in modern machine learning workflows. The library allows engineers to load over 350,000 datasets (SQuAD, Common Crawl, etc.) directly from the Hugging Face Hub. It uses an Apache Arrow backend for zero-copy reads, enabling efficient handling of massive datasets without RAM constraints. This architecture streamlines data preparation, making it fast and scalable for training state-of-the-art models across various domains.
- Phi-3: Microsoft's family of small language models (SLMs) delivering high-reasoning performance on local devices and edge hardware. Phi-3-mini packs 3.8 billion parameters into a footprint small enough for local deployment on an iPhone 14. Trained on a 3.3 trillion token dataset of high-quality synthetic data and filtered web content, it rivals much larger models (such as Mixtral 8x7B) on benchmarks for coding and logic. The family includes 7B (small) and 14B (medium) variants, providing developers with low-latency options for complex tasks without the massive compute requirements of traditional LLMs.
- MLX: Apple's high-performance array framework for machine learning on Apple silicon, leveraging unified memory for zero-copy efficiency. MLX is an open-source array framework from Apple machine learning research, purpose-built for efficient ML on Apple silicon (M-series chips). Its core strength is the unified memory model, which eliminates costly data transfers between the CPU and GPU, a major performance bottleneck in traditional frameworks. The API is immediately familiar, closely mirroring NumPy for array operations and PyTorch for higher-level packages like `mlx.nn` and `mlx.optimizers`. It supports Python, C++, C, and Swift bindings, making it highly flexible. Researchers use MLX to quickly train and deploy complex models, with examples including large-scale text generation with LLaMA and image creation via Stable Diffusion.
Related projects
From Abacus to AI: Small Potatoes, Big Results
Hong Kong
See how a non-coder built World of Warcraft addons, a bus app, and a phone ringer using AI…
Real-Time Snooker Table Understanding with PyTorch, MobileNet-SSD, and Jetson Devices
Hong Kong
This talk details building a real-time snooker understanding system using PyTorch and MobileNet-SSD on Jetson Nano, covering data…
Local hosting - sometimes joy can come in small packages
Hong Kong
Explore building small, quiet local AI inferencing systems, covering Windows and WSL hosting, plus a live demo of…
AI for Cultural Preservation
Hong Kong
This talk explores converting complex East Asian historical texts into AI-ready data using custom OCR, generative AI, and…
A Hybrid Pipeline using PaddleOCR Layout Analysis & LLM Text-to-SQL
Hong Kong
This talk demonstrates a pipeline using PaddleOCR for PDF layout analysis to structure financial data, followed by an…
“Facts + Citations” to “Futures + Scenarios”: Building an LLM Scenario Copilot for Company Risk and Opportunity
Hong Kong
Explore building an LLM copilot for generating divergent future scenarios, identifying risks and opportunities, and suggesting concrete actions…