Migration from GEPA
Migrating from DSPy's GEPA optimizer to Vizpy, or stacking PromptGrad on top
GEPA is DSPy's evolutionary optimizer for prompt instructions. It generates multiple instruction variants, evaluates them on your training set, uses an LLM to reflect on failures, and selects the best-performing variants — repeating this across several generations.
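The generate, evaluate, reflect, select cycle described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not DSPy's implementation: `evolve_instructions`, `propose`, `reflect`, and `score` are hypothetical names, and the stubs stand in for what would be LLM calls and a real training-set evaluation.

```python
from typing import Callable, List


def evolve_instructions(
    seed: str,
    propose: Callable[[str, List[str]], str],
    score: Callable[[str], float],
    reflect: Callable[[str], List[str]],
    generations: int = 3,
    variants_per_parent: int = 2,
    keep: int = 2,
) -> str:
    """Toy evolutionary search over instruction strings (hypothetical helper)."""
    pool = [seed]
    for _ in range(generations):
        candidates = list(pool)
        for parent in pool:
            # Reflect: summarize what the parent instruction gets wrong.
            feedback = reflect(parent)
            # Generate: propose new variants informed by that feedback.
            for _ in range(variants_per_parent):
                candidates.append(propose(parent, feedback))
        # Drop duplicates while preserving order, then keep the best scorers.
        candidates = list(dict.fromkeys(candidates))
        candidates.sort(key=score, reverse=True)
        pool = candidates[:keep]
    return pool[0]


# Deterministic stubs so the sketch runs without any model calls.
HINTS = ["Use plain language.", "Describe user impact.", "Keep it to one sentence."]


def score(instruction: str) -> float:
    # Stand-in for evaluating the instruction on a training set.
    return sum(hint in instruction for hint in HINTS)


def reflect(instruction: str) -> List[str]:
    # Stand-in for LLM reflection: list the hints the instruction still lacks.
    return [hint for hint in HINTS if hint not in instruction]


def propose(parent: str, feedback: List[str]) -> str:
    # Stand-in for LLM rewriting: address the first piece of feedback.
    return parent + " " + feedback[0] if feedback else parent


best = evolve_instructions("Write a changelog entry.", propose, score, reflect)
```

With these stubs, each generation closes one gap, so three generations are enough for the best candidate to cover all three hints.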
GEPA is good at broad search: finding a substantially better starting instruction when you don't know what the right prompt looks like. It explores a wide space and can make large jumps in performance early on.
Where GEPA plateaus: targeted refinement. Once it has found a reasonable instruction, the evolutionary search has no mechanism for accumulating specific rules from failure analysis. Each generation proposes fresh variants rather than adding to an explicit, growing set of learned rules.
Vizpy's PromptGradOptimizer is designed for exactly this stage.
Two Migration Paths
Path 1: Replace GEPA entirely
If you're not getting meaningful gains from GEPA after the first few generations,
replace it with PromptGradOptimizer. It starts from your module's existing instructions
and accumulates targeted improvements from batch failure analysis.
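As a rough mental model of "accumulating targeted improvements from batch failure analysis", the loop below keeps the base instruction fixed and appends one rule per failing batch. It is a minimal sketch, not Vizpy's implementation: `refine_with_rules`, `render`, `run`, and `propose_rule` are hypothetical names, and the failure analysis would be an LLM call in practice.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]


def render(base_instruction: str, rules: List[str]) -> str:
    """Combine the base instruction with the rules accumulated so far."""
    if not rules:
        return base_instruction
    return base_instruction + "\nRules:\n" + "\n".join("- " + r for r in rules)


def refine_with_rules(
    base_instruction: str,
    batches: List[List[Example]],
    run: Callable[[str, Example], str],
    propose_rule: Callable[[List[Example]], str],
) -> str:
    """Toy refinement loop (hypothetical helper): one new rule per failing batch."""
    rules: List[str] = []
    for batch in batches:
        prompt = render(base_instruction, rules)
        # Collect the examples the current prompt still gets wrong.
        failures = [ex for ex in batch if run(prompt, ex) != ex["expected"]]
        if not failures:
            continue
        # Analyze the failures and keep the resulting rule for later batches.
        rule = propose_rule(failures)
        if rule and rule not in rules:
            rules.append(rule)
    return render(base_instruction, rules)


# Deterministic stubs: an example passes only once its required rule is present.
ex1 = {"expected": "ok1", "needs": "Avoid internal component names."}
ex2 = {"expected": "ok2", "needs": "State the user-visible effect."}


def run(prompt: str, ex: Example) -> str:
    return ex["expected"] if ex["needs"] in prompt else ""


def propose_rule(failures: List[Example]) -> str:
    return failures[0]["needs"]


final_prompt = refine_with_rules("Summarize the change.", [[ex1, ex2], [ex2, ex1]], run, propose_rule)
```

The point of the sketch is that rules persist across batches: the second batch is evaluated against a prompt that already contains the rule learned from the first.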
Path 2: Use GEPA as initialization, PromptGrad for refinement
This is the recommended path if GEPA has already found a reasonable base instruction.
PromptGradOptimizer accepts base_prompt_source="gepa" — it runs GEPA internally
to get a strong starting point, then applies gradient-based refinement on top.
API Changes
Before (GEPA):
After (Vizpy, Path 1 — full replacement):
After (Vizpy, Path 2 — GEPA base + PromptGrad refinement):
Full Example: Changelog Generation
This task benefits from the two-stage approach. GEPA is effective at discovering that user-facing language is needed. PromptGrad then accumulates specific rules about vocabulary substitution (e.g. "session middleware" → "during checkout").
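One hypothetical way to measure the register shift described above is a metric that penalizes developer-facing vocabulary in a changelog entry. The sketch below is an assumption for illustration only: `user_facing_score` and the `INTERNAL_TERMS` list are invented here and are not part of DSPy or Vizpy.

```python
import re
from typing import Sequence

# Hypothetical list of developer-facing terms to penalize.
INTERNAL_TERMS = ("middleware", "refactor", "endpoint", "null pointer")


def user_facing_score(entry: str, internal_terms: Sequence[str] = INTERNAL_TERMS) -> float:
    """Return 1.0 when no internal term appears, less as more terms appear."""
    if not internal_terms:
        return 1.0
    text = entry.lower()
    # Count how many distinct internal terms occur as whole words or phrases.
    hits = sum(
        1
        for term in internal_terms
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", text)
    )
    return 1.0 - hits / len(internal_terms)


developer_entry = "Fixed a race condition in the session middleware."
user_entry = "Fixed an issue that could sign you out during checkout."

developer_score = user_facing_score(developer_entry)
user_score = user_facing_score(user_entry)
```

A term-list metric like this only catches vocabulary, not tone, so in practice it would likely be combined with a judgment of whether the entry describes user impact.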
When the Two-Stage Approach Wins
The reason to stack GEPA + PromptGrad rather than using either alone:
- GEPA explores broadly and can discover that the entire register needs to shift (e.g. "write as user impact, not code description"). It makes the big jump.
- PromptGrad then accumulates specific rules from failure patterns: specific vocabulary substitutions, exception cases, edge conditions that the broad GEPA instruction doesn't handle.
The result is instructions that are globally correct (GEPA) and locally precise (PromptGrad). You get the exploration benefit of evolutionary search and the precision benefit of gradient-based refinement.
From the research backing these optimizers: on HotPotQA, this two-stage architecture raised the normalized performance gain from +80% with GEPA alone to +126% combined. The PromptGrad refinement stage accounts for the additional 46 percentage points.
When to Use a GEPA Base vs. Full Replacement
| Situation | Recommendation |
|---|---|
| GEPA gives no meaningful gains after the first few generations | Replace with PromptGradOptimizer |
| GEPA found a reasonable base instruction, then plateaued | Use base_prompt_source="gepa" |
| You want interpretable, accumulated rules | PromptGradOptimizer either way |
| You have < 20 training examples | ContraPromptOptimizer is faster and more sample-efficient |
| Task has clear contrastive pairs (right vs. wrong label) | ContraPromptOptimizer instead of GEPA path |
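The table can be read as a small decision procedure. The helper below is a sketch of that reading: `recommend_optimizer` and its parameters are hypothetical, the returned strings are labels rather than constructor calls, and the precedence between rows (ContraPrompt conditions checked first) is an assumption, since the table does not rank them.

```python
def recommend_optimizer(
    n_train: int,
    has_contrastive_pairs: bool,
    gepa_found_strong_base: bool,
) -> str:
    """Map the situations in the table to a recommendation label (hypothetical helper)."""
    # Small datasets and clear right-vs-wrong pairs both point to ContraPrompt.
    if n_train < 20 or has_contrastive_pairs:
        return "ContraPromptOptimizer"
    # A strong GEPA base is worth keeping as the starting point for refinement.
    if gepa_found_strong_base:
        return 'PromptGradOptimizer with base_prompt_source="gepa"'
    # Otherwise start from the module's existing instructions.
    return "PromptGradOptimizer"


small_data = recommend_optimizer(12, False, True)
stacked = recommend_optimizer(200, False, True)
replaced = recommend_optimizer(200, False, False)
```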