
Fixing public datasets — Re-annotating Animals 10
Public datasets are surely handy, but are they always dandy? We found that many large public datasets contain some amount of erroneous data. However, scrolling through thousands…
End-to-end walkthroughs of the Rapidata SDK. Real datasets, real code, real human annotators — the recipes we wish existed when we were building this stuff.
In the last post, we re-annotated the Animals-10 dataset using the given 10 categories, plus "Something Else". It turned out that quite a few images ended up in this mystery bucket.…

In a previous blog post and paper, we presented a benchmark for evaluating generative text-to-image models based on a large-scale collected preference dataset consisting of more…
What we’re learning from human feedback at scale.
As detailed in our LinkedIn post, we collected more than 6,000 votes from people around the world, finding that DALL-E 3 better portrays the explicit sentiment and emotion in this…