High Frequency Trading - Statistical Arbitrage | Hong Kong

January 20, 2026 · Hong Kong

HFT Statistical Arbitrage Models

This talk covers optimizing statistical models on historical order book data, a key metric, and turning model output into high-frequency trades.

Tech stack
  • Python
    Python: The high-level, general-purpose language built for readability, powering everything from web backends to advanced machine learning models.
    Python is a high-level, general-purpose language that prioritizes clear, readable syntax (via significant indentation), which supports rapid development. Its ecosystem is massive: use it for web development with frameworks like Django and Flask, or for data science with libraries such as Pandas and NumPy. The Python Package Index (PyPI) hosts hundreds of thousands of community-contributed packages, covering tasks from network programming to GUI creation. The language is actively maintained by the Python Software Foundation (PSF), with the stable release currently at Python 3.14.0 (as of November 2025).
  • Binance API
    The Binance API provides REST and WebSocket endpoints for programmatic access to market data, automated trading, and account management across Spot, Futures, and Margin products.
    This is the official gateway for developers to integrate with the world's largest crypto exchange by volume. The API uses REST for request/response calls (e.g., placing a limit order for BTCUSDT) and WebSocket for real-time data streaming (e.g., price quotes, account updates). Key functionality includes trading automation, account management (checking balances, transaction history), and access to market data (order book depth, k-line/candlestick data). It supports Spot, Margin, Futures, and Options trading, enabling high-frequency strategies and custom financial applications.
  • Machine Learning
    Compute dense vector representations for sentences and paragraphs using a Python framework that optimizes transformer models for semantic similarity.
    Sentence Transformers (SBERT) maps variable-length text into fixed-size, high-dimensional vectors to enable high-speed semantic analysis. The framework fine-tunes architectures like BERT using Siamese and Triplet network structures: this ensures semantically similar inputs cluster together in vector space. This methodology powers critical NLP tasks: semantic search, clustering, and paraphrase detection. Standard models (like all-MiniLM-L6-v2) provide a 384-dimensional baseline for real-time applications, while larger variants (such as all-mpnet-base-v2) offer superior accuracy for complex information retrieval.
  • Data Visualization
    Transform complex datasets (Big Data) into clear visual elements (charts, graphs, maps) for rapid pattern recognition and data-driven decision-making.
    Data Visualization converts massive, complex data streams (e.g., trillions of rows) into accessible graphical representations: think dashboards, line charts, and heat maps. This process is critical for quickly identifying trends, outliers, and relationships that text-based reports miss. It empowers non-technical users to internalize insights rapidly, driving smarter business actions (e.g., a 15% faster identification of a production bottleneck). The goal is efficient communication: telling a clear, compelling story with data, moving past raw numbers to actionable intelligence.
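As a rough illustration of the order book depth data mentioned under the Binance API entry, the sketch below parses a depth snapshot and derives a few simple features. It is not taken from the talk: the function names `fetch_depth` and `book_features`, the choice of features, and the sample values are all assumptions made for illustration. The URL is the public Spot REST depth endpoint; the network call is defined but not executed here.

```python
import json
from urllib.request import urlopen

# Public Binance Spot REST endpoint for order book depth snapshots.
DEPTH_URL = "https://api.binance.com/api/v3/depth?symbol={symbol}&limit={limit}"


def fetch_depth(symbol="BTCUSDT", limit=5):
    """Fetch a depth snapshot over REST (requires network access)."""
    url = DEPTH_URL.format(symbol=symbol, limit=limit)
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def book_features(depth):
    """Compute mid-price, spread, and volume imbalance (hypothetical helper).

    `depth` follows the snapshot shape: "bids" and "asks" are lists of
    [price, quantity] string pairs, best price first.
    """
    bids = [(float(p), float(q)) for p, q in depth["bids"]]
    asks = [(float(p), float(q)) for p, q in depth["asks"]]
    best_bid, best_ask = bids[0][0], asks[0][0]
    bid_vol = sum(q for _, q in bids)
    ask_vol = sum(q for _, q in asks)
    return {
        "mid": (best_bid + best_ask) / 2,
        "spread": best_ask - best_bid,
        # Imbalance in [-1, 1]: positive means more resting bid volume.
        "imbalance": (bid_vol - ask_vol) / (bid_vol + ask_vol),
    }


# Hand-written sample snapshot (illustrative values, not real market data).
SAMPLE = {
    "lastUpdateId": 1,
    "bids": [["100.00", "3.0"], ["99.50", "1.0"]],
    "asks": [["101.00", "1.5"], ["101.50", "0.5"]],
}

features = book_features(SAMPLE)
print(features)
```

To try it against live data, call `book_features(fetch_depth("BTCUSDT", 5))`. For high-frequency use, a WebSocket depth stream would typically replace repeated REST polling.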

Related projects