JP-TL-Bench: AI Paper Writing
Learn how to leverage AI tools to efficiently write an arXiv paper detailing a novel Japanese/English translation evaluation methodology in a single day.
Recently, we open-sourced several of the evals we used for developing our Shisa V2 models. One of the most useful was JP-TL-Bench, our Japanese/English translation eval. It’s notable because it introduces a new scoring methodology: it keeps the discriminative power of pairwise comparisons while avoiding both the quadratic scaling and the score drift that come with conventional rating schemes such as Elo. It’s worth writing a paper about. How can we best use AI to help us write that paper efficiently, without it being complete slop?
JP-TL-Bench uses anchored pairwise LLM comparisons and Bradley-Terry modeling to discriminate Japanese/English translation quality.
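To make the idea of anchored Bradley-Terry scoring concrete, here is a minimal sketch. It is not the JP-TL-Bench implementation. It assumes a fixed set of anchor systems with already-known log-strengths, and estimates a single candidate's log-strength by maximum likelihood from its win/loss record against those anchors. The anchor names, their strengths, and the function name `fit_candidate_strength` are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function; under Bradley-Terry, P(i beats j) = sigmoid(s_i - s_j)."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_candidate_strength(results, anchor_strengths, lo=-20.0, hi=20.0, iters=100):
    """Estimate a candidate's log-strength against fixed anchors.

    results: list of (anchor_name, candidate_won) pairs, one per judged comparison.
    anchor_strengths: dict mapping anchor_name -> fixed log-strength (assumed known).

    The log-likelihood is concave in the candidate's strength, so its derivative
    (observed wins minus expected wins) is decreasing; we find its root by bisection.
    If the candidate wins or loses every comparison, the estimate runs to the
    search bound (hi or lo) because the MLE is unbounded in that case.
    """
    def gradient(theta):
        return sum(
            (1.0 if won else 0.0) - sigmoid(theta - anchor_strengths[name])
            for name, won in results
        )

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if gradient(mid) > 0:
            lo = mid  # candidate wins more than predicted: strength is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical anchors and judged outcomes, for illustration only.
    anchors = {"anchor_weak": -1.0, "anchor_mid": 0.0, "anchor_strong": 1.0}
    outcomes = [
        ("anchor_weak", True), ("anchor_weak", True),
        ("anchor_mid", True), ("anchor_mid", False),
        ("anchor_strong", False), ("anchor_strong", True),
    ]
    print(fit_candidate_strength(outcomes, anchors))
```

Because the anchors stay fixed, each new candidate needs comparisons only against the anchor set rather than against every other candidate, and previously computed scores do not move when new candidates are added. As a sanity check, a candidate that wins 3 of 4 comparisons against a single anchor of strength 0 gets an estimate of ln 3, the point where the predicted win rate equals 0.75.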