Vizpy

GSM8K

Grade school math word problems benchmark

Difficulty: Intermediate | File: examples/02_gsm8k.py

Grade school math word problems requiring multi-step reasoning.

Usage

python examples/02_gsm8k.py --lm anthropic/claude-haiku-4-5-20251001 --demo
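
How `examples/02_gsm8k.py` scores answers is not shown on this page. As a rough illustration of what evaluating a multi-step word problem can involve, the sketch below relies on the GSM8K dataset convention that each reference solution ends with `#### <final answer>`. The helper names `extract_final_answer` and `is_correct` are hypothetical and not part of Vizpy, and the sample problem is made up for illustration.

```python
import re
from typing import Optional

# Matches an optionally negative number that may contain thousands
# separators and a decimal part, e.g. "18", "1,250", "-3.5".
_NUMBER = r"-?[\d,]*\.?\d+"


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the number after the '####' marker, or None if absent."""
    match = re.search(r"####\s*(" + _NUMBER + ")", solution)
    if match is None:
        return None
    return match.group(1).replace(",", "")


def is_correct(prediction: str, reference_solution: str) -> bool:
    """Compare the last number in the model output with the gold answer.

    Assumption: the model's final answer is the last number it prints.
    """
    gold = extract_final_answer(reference_solution)
    if gold is None:
        return False
    numbers = re.findall(_NUMBER, prediction)
    if not numbers:
        return False
    return float(numbers[-1].replace(",", "")) == float(gold)


# Made-up example in the GSM8K solution style (not taken from the dataset).
reference = "3 * 4 = 12 apples in total.\n12 - 2 = 10 apples left.\n#### 10"
model_output = "There are 12 apples, and after eating 2 the answer is 10."

print(extract_final_answer(reference))
print(is_correct(model_output, reference))
```

Taking the last number in the output is a simple heuristic; a stricter setup would ask the model to emit its answer in a fixed format and parse only that.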
