JetBrains AI Codebase Benchmarks
Exploring JetBrains Long Code Arena benchmarks, we'll demonstrate project‑wide code completion and library‑based generation, discuss context strategies, and evaluate new syntax and n‑gram metrics.
We are contributing to Long Code Arena (LCA), an open-source project by JetBrains Research. LCA consists of six benchmarks that evaluate how AI models perform on tasks that span a developer's entire project. We have been working on two of them: project-level code completion and library-based code generation. Project-level code completion uses the full project as context to generate the next line of code in a file. Library-based code generation tests a model's ability to generate code that makes appropriate use of a given library's methods. We evaluated several models using each benchmark's key metrics and applied several techniques to improve their performance, including changes to how we structure the prompts and what additional context we supply. We also contributed further metrics, such as syntax matching and n-gram matching, to assess the quality of model output more effectively. This work matters because it lets us experiment with different context collection techniques on the source datasets provided by JetBrains.
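To give a rough idea of what n-gram and syntax-based scoring can look like, here is a minimal sketch. It is not the LCA implementation: the function names `ngram_precision` and `is_valid_python`, the whitespace tokenization, and the choice of bigrams are all assumptions made for illustration, and the syntax check only tests whether the output parses, which is a much weaker proxy than comparing syntax trees.

```python
import ast
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n=2):
    """Clipped n-gram precision of a generated line against the reference.

    Hypothetical helper: tokens are split on whitespace, which is a
    simplification compared to a real code tokenizer.
    """
    cand_counts = Counter(ngrams(candidate.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


def is_valid_python(code):
    """Return True if the generated code parses as Python."""
    try:
        ast.parse(code)
        return True
    except (SyntaxError, ValueError):
        return False


if __name__ == "__main__":
    generated = "x = foo ( a )"
    expected = "x = foo ( b )"
    print(ngram_precision(generated, expected, n=2))
    print(is_valid_python("x = foo(a)"))
```

Scores like these can be averaged over a benchmark's samples to compare models or context strategies, alongside the benchmark-specific metrics.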
A Python benchmark suite for large-context code generation and repair tasks.