Landing the Lunar Lander with Human Feedback
Reinforcement Learning from Human Feedback is most commonly associated with the final training stages of large language models like ChatGPT or Claude. It’s what helps these models…
Long reads, experiments and insights. Written by the people who build Rapidata — for the people who train models.
In the past few years, text-to-image models have evolved from DALL-E [7] to Stable Diffusion [8] to more recently Imagen 4 [9]. Early diffusion-based models struggled with simple…
Reinforcement Learning from Human Feedback is most commonly associated with the final training stages of large language models like ChatGPT or Claude. It’s what helps these models…

TL;DR: We collected 1.5 million annotations from >150 thousand individual humans using Rapidata via the Python API to build a dataset of detailed human feedback for text-to-image…
What we’re learning from human feedback at scale.
Text-to-image models like DALL-E 3, Stable Diffusion, MidJourney, and Flux.1 have gained popularity for generating images from text prompts. However, benchmarking these models is…
Object detection in computer vision is a fascinating blend of mathematics, algorithms, and machine learning that allows computers to identify and locate objects within an image or…