OpenDataArena
Fair, Open, and Transparent Arena for Data — Benchmarking Dataset Value
✨ Make every dataset measurable, comparable, and verifiable for LLM post-training.
✨ Leading benchmarks across general, math, code, science, and other domains.
✨ Unified evaluation pipeline, open-source scoring tools, visualized leaderboards.
✨ Community collaboration and methodological innovation for a better data-centric paradigm.
- Quickly assess dataset quality
- Select the right data for your task
- Evaluate the quality of data you create or synthesize yourself
- Use ready-to-go tools for scoring your data
- Build better experimental subsets with confidence
- Join a community shaping the future of data value
News
- 2025.07.28|Data scoring framework open-sourced, supporting LLM-based judgment and diverse evaluation metrics for automated data assessment (a conceptual sketch follows this news list). A detailed introduction is provided in the Wiki.
- 2025.07.27|Multi-dimensional data scoring results released, in collaboration with OpenDataLab, empowering deeper insights into dataset quality.
- 2025.07.26|Full training and evaluation toolkit open-sourced, supporting reproducible experiments on mainstream models and benchmarks.
- 2025.07.25|OpenDataArena v1.0 launched, enabling post-training data validation across multiple domains, tasks, and evaluation dimensions.
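To make the LLM-based judgment idea concrete, below is a minimal sketch of how an LLM judge can assign multi-dimensional scores to instruction-response pairs and aggregate them into dataset-level statistics. The prompt, the criteria names (clarity, difficulty, correctness), and the helpers `score_sample`, `score_dataset`, and `judge_fn` are illustrative assumptions, not the actual OpenDataArena API; see the Wiki for the real framework.

```python
import json
from statistics import mean
from typing import Callable

# Hypothetical judging prompt; the real framework defines its own criteria.
JUDGE_PROMPT = """Rate the following instruction-response pair on a 1-5 scale
for each criterion, and reply with JSON like
{{"clarity": 4, "difficulty": 3, "correctness": 5}}.

Instruction: {instruction}
Response: {response}
"""

def score_sample(sample: dict, judge_fn: Callable[[str], str]) -> dict:
    """Ask an LLM judge to score one instruction-response pair.

    `judge_fn` is any callable that sends a prompt to an LLM and returns
    its raw text reply (e.g. a thin wrapper around your API client).
    """
    reply = judge_fn(JUDGE_PROMPT.format(**sample))
    return json.loads(reply)

def score_dataset(samples: list[dict], judge_fn: Callable[[str], str]) -> dict:
    """Score every sample and report the per-criterion mean for the dataset."""
    per_sample = [score_sample(s, judge_fn) for s in samples]
    criteria = per_sample[0].keys()
    return {c: mean(s[c] for s in per_sample) for c in criteria}

if __name__ == "__main__":
    # Stub judge for demonstration; swap in a real LLM call in practice.
    fake_judge = lambda prompt: '{"clarity": 4, "difficulty": 2, "correctness": 5}'
    data = [{"instruction": "Add 2 and 3.", "response": "2 + 3 = 5."}]
    print(score_dataset(data, fake_judge))  # {'clarity': 4, 'difficulty': 2, 'correctness': 5}
```

In practice, `judge_fn` would wrap a call to your model provider, and the aggregated per-criterion scores are what make datasets comparable on a leaderboard.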
What’s Coming
- Release seed datasets with built-in support for data filtering and subset selection
- Benchmark Qwen3 and other latest LLMs with standardized evaluation results
- Update dataset rankings monthly with the latest evaluation results
- Expand to domain-specific datasets, including medical, scientific, and other high-value fields
- Support multimodal and reasoning-intensive datasets, enabling richer evaluation scenarios
- Introduce more data scoring factors and evaluation methods, including automatic, human-aligned, and task-aware metrics
Our Contributors
Thanks to these outstanding researchers and developers for their contributions to OpenDataArena.