| Rank | Creator | Model | Bradley-Terry | Elo | Wins | Matches |
|---|---|---|---|---|---|---|
| 1 | Google DeepMind | veo3 | 1451.12 | 1135.48 | 29584 | 49259 |
| 2 | OpenAI | sora | 1029.03 | 1010.32 | 43127 | 84262 |
| 3 | Pika Art | pika2.2 | 1011.62 | 1004.12 | 44737 | 88682 |
| 4 | Google DeepMind | veo2 | 984.75 | 994.32 | 23103 | 46464 |
| 5 | Pika Art | pika | 959.29 | 984.77 | 49110 | 99592 |
| 6 | Runway | alpha | 942.38 | 978.27 | 21403 | 44259 |
| 7 | Tencent | hunyuan | 939.16 | 977.05 | 41095 | 84072 |
| 8 | Alibaba | wan2.1 | 917.28 | 968.77 | 7078 | 15789 |
| 9 | Luma Labs | ray2 | 864.69 | 946.92 | 58292 | 126290 |

What is "Bradley-Terry"?

The Bradley-Terry ranking model is a probabilistic model for predicting the outcome of pairwise comparisons. It assigns each item a strength parameter (the reported score); the higher an item's strength relative to another's, the more likely it is to win a comparison between the two. See the Wikipedia article for mathematical details.
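
To make the idea concrete, here is a minimal sketch of how Bradley-Terry strengths can be estimated from pairwise win counts. It uses the classic iterative (minorization-maximization) update and is not necessarily the procedure behind this leaderboard; the function names `fit_bradley_terry` and `win_probability` and the toy models "A", "B", "C" are hypothetical. The raw strengths it returns are on an arbitrary scale, and how the reported scores are rescaled is not specified here.

```python
from collections import defaultdict


def fit_bradley_terry(wins, iterations=200, tol=1e-9):
    """Estimate Bradley-Terry strengths from pairwise win counts.

    wins[(a, b)] is the number of times a beat b (hypothetical input format).
    Uses the iterative minorization-maximization update.
    """
    items = sorted({name for pair in wins for name in pair})
    total_wins = defaultdict(float)
    games = defaultdict(float)  # games[(a, b)]: comparisons between a and b
    for (a, b), n in wins.items():
        total_wins[a] += n
        games[(a, b)] += n
        games[(b, a)] += n

    strength = {i: 1.0 for i in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = sum(
                games[(i, j)] / (strength[i] + strength[j])
                for j in items
                if j != i and games[(i, j)] > 0
            )
            updated[i] = total_wins[i] / denom if denom > 0 else strength[i]
        # The scale is arbitrary, so normalize strengths to sum to len(items).
        scale = len(items) / sum(updated.values())
        updated = {i: s * scale for i, s in updated.items()}
        delta = max(abs(updated[i] - strength[i]) for i in items)
        strength = updated
        if delta < tol:
            break
    return strength


def win_probability(strength, a, b):
    """Probability that a beats b under the Bradley-Terry model."""
    return strength[a] / (strength[a] + strength[b])


# Toy data (made up for illustration): each pair was compared 10 times.
toy_wins = {
    ("A", "B"): 7, ("B", "A"): 3,
    ("A", "C"): 8, ("C", "A"): 2,
    ("B", "C"): 6, ("C", "B"): 4,
}
toy_strength = fit_bradley_terry(toy_wins)
print(toy_strength)
print(win_probability(toy_strength, "A", "B"))
```

Because only the ratio of strengths matters, multiplying every strength by the same constant leaves all predicted win probabilities unchanged, which is why the sketch fixes the scale by normalization.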

What do we consider as "Overall preference"?

Here we evaluate each model across all criteria and determine which one has the best overall performance.

All results are based directly on feedback from real human raters. How we arrived at these results is described in our blog post.

Examples

Visual examples of the annotators’ preferences

Preference
Which video do you find better looking?
Sora
HunyuanVideo
Coherence
Which video feels less weird or unnatural for its style when you look closely? I.e. fewer odd or strange-looking objects or elements
HunyuanVideo
Sora
Alignment
Which image is more aligned with and better adheres to the prompt:
A 'day in the life' of a cutting-edge urban dance crew rehearsing for a global competition. Showcase their intense training, creative choreography sessions, high-tech stage setups, team dynamics, and the adrenaline-filled moments of their electrifying performance under the city lights.
HunyuanVideo
Sora