Recipe Difficulty Rating
Fix a model that thinks "5 ingredients = Easy"
LLMs rate recipe difficulty by the wrong signals. Five ingredients, short prep time, few steps: that must be Easy, right? Except Beef Wellington can be written out in five steps and is genuinely hard. The model has no concept of technique difficulty, so it can't tell the difference between "chop onions" and "julienne carrots into 2mm matchsticks while keeping them cold so they don't wilt."
This example uses ContraPromptOptimizer to teach the model what actually makes a recipe hard: knife technique, timing parallelism, temperature precision, and the gap between reading a technique and executing it under pressure.
Optimizer: ContraPromptOptimizer
Difficulty: Beginner
The Failure Mode
The model counts steps and ingredients. Taking Beef Wellington as the example, it misses that:
- Getting beef to exactly medium-rare through pastry requires a probe thermometer and experience
- Duxelles must be cooked completely dry or the pastry goes soggy — a technique that takes feel, not just instructions
- You have to rest the beef twice at specific temperatures, coordinated with pastry timing
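The failure mode can be reproduced without an LLM. The following is a minimal sketch, assuming a hypothetical naive_rating heuristic that looks only at ingredient and step counts. The Recipe class, the thresholds, and the step counts are illustrative assumptions, not part of ContraPromptOptimizer or of the example's dataset.

```python
from dataclasses import dataclass

# Ordering used to compare a predicted label against the human label.
LEVELS = {"Easy": 0, "Medium": 1, "Hard": 2}


@dataclass
class Recipe:
    name: str
    n_ingredients: int
    n_steps: int
    true_difficulty: str  # human-assigned label


def naive_rating(recipe: Recipe) -> str:
    """Hypothetical surface heuristic: rate by ingredient and step counts alone."""
    score = recipe.n_ingredients + recipe.n_steps
    if score <= 10:
        return "Easy"
    if score <= 18:
        return "Medium"
    return "Hard"


def error_type(predicted: str, actual: str) -> str:
    """Classify the direction of a rating error."""
    if LEVELS[predicted] < LEVELS[actual]:
        return "underrated"
    if LEVELS[predicted] > LEVELS[actual]:
        return "overrated"
    return "correct"


# Counts below are illustrative placeholders, not measured data.
recipes = [
    Recipe("beef wellington", n_ingredients=5, n_steps=5, true_difficulty="Hard"),
    Recipe("hollandaise", n_ingredients=3, n_steps=4, true_difficulty="Hard"),
    Recipe("banana bread", n_ingredients=10, n_steps=6, true_difficulty="Easy"),
]

results = {r.name: error_type(naive_rating(r), r.true_difficulty) for r in recipes}
for name, err in results.items():
    print(f"{name}: {err}")
```

Under these assumed counts, the heuristic underrates both technique-heavy dishes and overrates the long but forgiving one, which is the pattern described above.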
Full Example
What the Optimizer Discovers
The contrastive pairs here are particularly rich: a 3-ingredient hollandaise rates Hard, while a 10-ingredient banana bread rates Easy. The optimizer sees these pairs and extracts rules that override the surface-level heuristics:
"Rate based on technique precision, not ingredient count. Key hard signals: narrow temperature windows, emulsification steps, simultaneous timing coordination, and any technique where failure produces an unrecoverable result."
The error_type field (underrated / overrated) enables stratified sampling, so the optimizer sees both directions of error in each batch instead of just the more common one.
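To make the idea of stratifying on error_type concrete, here is a rough sketch of one way to draw a batch evenly across error directions. This is not ContraPromptOptimizer's actual implementation; stratified_batch, the dictionary layout, and the sample data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_batch(examples, batch_size, rng):
    """Draw roughly equal numbers of examples from each error_type group.

    Hypothetical helper: the real optimizer's sampling logic may differ.
    """
    if not examples:
        return []

    # Group examples by their error direction.
    by_type = defaultdict(list)
    for ex in examples:
        by_type[ex["error_type"]].append(ex)

    groups = sorted(by_type)  # sorted for a deterministic group order
    per_group = max(1, batch_size // len(groups))

    batch = []
    for group in groups:
        pool = by_type[group]
        # Never ask for more examples than the group holds.
        batch.extend(rng.sample(pool, min(per_group, len(pool))))
    return batch


# Illustrative, imbalanced data: underrated errors are four times as common.
examples = (
    [{"recipe": f"hard-dish-{i}", "error_type": "underrated"} for i in range(8)]
    + [{"recipe": f"easy-dish-{i}", "error_type": "overrated"} for i in range(2)]
)

batch = stratified_batch(examples, batch_size=4, rng=random.Random(0))
for ex in batch:
    print(ex["recipe"], ex["error_type"])
```

With uniform random sampling, a batch of four from this data would often contain no overrated example at all. Grouping first guarantees that the rarer direction is represented whenever at least one such example exists.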