
OpenDataArena

Fair, Open, and Transparent Arena for Data — Benchmarking Dataset Value

✨ Make every dataset measurable, comparable, and verifiable for LLM post-training.

✨ Leading benchmarks across general, math, code, science, and other domains.

✨ Unified evaluation pipeline, open-source scoring tools, visualized leaderboards.

✨ Community collaboration and methodological innovation for a better data-centric paradigm.

  • Quickly assess dataset quality
  • Select the right data for your task
  • Evaluate the quality of your own created or synthetic data
  • Use ready-to-go tools for scoring your data
  • Build better experimental subsets with confidence
  • Join a community shaping the future of data value

News

What’s Coming

Short-Term Plans
  • Release seed datasets with built-in support for data filtering and subset selection
  • Benchmark Qwen3 and other latest LLMs with standardized evaluation results
Mid-to-Long Term Plans
  • Update dataset rankings monthly and provide the latest evaluation results
  • Expand to domain-specific datasets, including medical, scientific, and other high-value fields
  • Support multimodal and reasoning-intensive datasets, enabling richer evaluation scenarios
  • Introduce more data scoring factors and evaluation methods, including automatic, human-aligned, and task-aware metrics

Our Contributors

Thanks to these outstanding researchers and developers for their contributions to OpenDataArena

  • 4+ Domains
  • 20+ Benchmarks
  • 15+ Score Dimensions
  • 600+ Trainings
  • 10K+ Evaluations
  • 100+ Datasets
  • 10M+ Data Points