4 guides · 2 contributors · 12 minutes of reading, total · Last update · Nov 22, 2024

Use Cases

End-to-end walkthroughs of the Rapidata SDK. Real datasets, real code, real human annotators — the recipes we wish existed when we were building this stuff.

№ 01 · the lead

Freetext: How to label unknown classes

In the last post, we re-annotated the Animals-10 dataset using the given 10 categories, plus Something Else. It turned out that quite a few images ended up in this mystery bucket.…

Marian Kannwischer · Nov 22, 2024 · 3 min
№ 02 · second read

Fixing public datasets — Re-annotating Animals 10

Public datasets are surely handy, but are they always dandy? We found that many large public datasets contain some amount of erroneous data. However, scrolling through thousands…

Marian Kannwischer · Oct 31, 2024 · 3 min
№ 03 · third read

On-Demand Human Preference Data for AI Training

In a previous blog post and paper, we presented a benchmark for evaluating generative text-to-image models based on a collected large-scale preference dataset consisting of more…

Mads Kuhlmann-Joergensen · Oct 22, 2024 · 5 min
The Index · 1 more entry