Bring human insight into your model checkpoints—before it's too late.
Model developers currently rely on specialised pre-trained evaluation models. However, these automated tools can only approximate direct human feedback.
By the time real people are using your product, it's already too late.
Whenever you define a checkpoint, the model pauses to generate the images that will be benchmarked against 4o.
Rapidata automates the pipelines needed to gather human feedback and provides you with insights.
When the results are ready, they are visualized directly in your Weights & Biases dashboard.
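The snippet below shows how the evaluator plugs into a typical training loop: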
import wandb
from checkpoint_evaluation.image_checkpoint_evaluator import ImageEvaluator

# Initialize wandb
run = wandb.init(project="my-project")

# Create evaluator
evaluator = ImageEvaluator(wandb_run=run, model_name="my-model")

# In your training loop
for step in range(100):
    # ... your training code ...

    # Generate or load validation images (every N steps)
    if step % 10 == 0:
        # Fire-and-forget evaluation - returns immediately!
        evaluator.evaluate(generate_images())

    # ... continue training ...

# Wait for all evaluations to complete before finishing
evaluator.wait_for_all_evaluations()
run.finish()
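The example above calls a generate_images() helper that is not defined here. A minimal sketch is shown below, assuming the evaluator accepts a list of PIL images; in practice, replace the placeholder images with samples drawn from the checkpointed model, and check which input types your evaluator version supports.

from PIL import Image

# Hypothetical helper: stands in for your model's own sampling code.
# Assumes the evaluator accepts a list of PIL images; adjust if your
# evaluator version expects file paths or another format.
def generate_images(num_images: int = 4, size: tuple[int, int] = (512, 512)) -> list[Image.Image]:
    images = []
    for i in range(num_images):
        # Placeholder: a solid-colour image stands in for model output.
        images.append(Image.new("RGB", size, color=(i * 40 % 256, 128, 200)))
    return images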
To get started, install the package:

pip install crowd-eval