GSM8K
Grade school math word problems benchmark

Difficulty: Intermediate | File: examples/02_gsm8k.py

Grade school math word problems requiring multi-step reasoning.

Usage

    python examples/02_gsm8k.py --lm anthropic/claude-haiku-4-5-20251001 --demo
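
The contents of examples/02_gsm8k.py are not shown on this page, so the following is only a rough sketch of how answers to GSM8K-style problems are commonly scored, not the script's actual logic. It assumes the public dataset's convention that a reference solution ends with a `####` marker followed by the final answer, and it uses a common heuristic of taking the last number in the model's output as its prediction. The function names and the sample problem are hypothetical and made up for illustration.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration; not taken from examples/02_gsm8k.py.

# Matches integers and decimals, optionally negative, with optional thousands commas.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def reference_answer(solution: str) -> str:
    """Return the final answer from a GSM8K-style reference solution.

    Assumes the final answer follows a '####' marker (dataset convention).
    """
    return solution.split("####")[-1].strip().replace(",", "")


def predicted_answer(completion: str) -> Optional[str]:
    """Take the last number in a model completion as its answer (a heuristic)."""
    matches = _NUMBER.findall(completion)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def is_correct(completion: str, solution: str) -> bool:
    """Compare predicted and reference answers numerically."""
    pred = predicted_answer(completion)
    if pred is None:
        return False
    try:
        return float(pred) == float(reference_answer(solution))
    except ValueError:
        return False


# Made-up example in the style of the dataset (not an actual GSM8K item).
solution = "Each box holds 12 pencils, so 3 boxes hold 3 * 12 = 36 pencils.\n#### 36"
print(is_correct("3 * 12 = 36, so the answer is 36.", solution))
print(is_correct("The answer is 35.", solution))
```

Comparing as floats rather than strings lets "36" and "36.0" count as the same answer; a stricter harness might instead require an exact string match.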