Benchmarks
Comprehensive performance evaluations of The Token Company compression API. Each benchmark provides detailed methodology, statistical analysis, and reproducible results.
Making LLMs understand financial documents better
Compression improved financial QA accuracy by 2.7 percentage points on 150 SEC filing questions — while reducing input tokens by up to 20%.
February 2026
Reducing LLM response times through compression
Up to 37% faster on Claude Opus 4.6 and 30% on GPT-5.2 — saving seconds per request across 5 input sizes with sub-120ms compression overhead.
February 2026
We are working on updating these benchmarks to cover more models and domains, using the next generation of our compression models.
Why benchmark?
Token compression must balance efficiency with quality. Removing too many tokens risks degrading model performance, while removing too few limits cost savings.
Our benchmarks use rigorous statistical methodology — hundreds of measurements per configuration, bootstrap analysis, and transparent reporting of both methodology and limitations.
Every result is reproducible. We publish the exact configurations, datasets, and evaluation criteria so you can verify our claims.
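As an illustration of the bootstrap analysis mentioned above, the sketch below computes a percentile-bootstrap confidence interval for a mean. The data and function name are hypothetical, not taken from the published benchmarks; this is a minimal example of the general technique, not the exact evaluation code.

```python
import random
import statistics

def bootstrap_ci(samples, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`.

    Resamples with replacement, records each resample's mean, and reads the
    interval off the sorted means at the alpha/2 and 1 - alpha/2 quantiles.
    """
    rng = random.Random(seed)  # fixed seed so results are reproducible
    n = len(samples)
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples)]
    return lo, hi

# Illustrative per-request latency savings in seconds -- invented numbers,
# not actual benchmark measurements.
deltas = [1.9, 2.4, 1.1, 3.0, 2.2, 1.7, 2.8, 2.0, 1.5, 2.6]
low, high = bootstrap_ci(deltas)
print(f"mean saving: {statistics.fmean(deltas):.2f}s, 95% CI [{low:.2f}, {high:.2f}]")
```

With hundreds of measurements per configuration, an interval like this conveys how much of a reported speedup is signal rather than run-to-run noise.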