ISON Performance at a Glance
What does this mean?
Think of tokens like words in a book. If JSON is a 100-page book, ISON tells the same story in just 28 pages. AI can read the shorter book faster, understand it just as well, and have room to think about more things at once. It's like packing for vacation - ISON fits everything in a carry-on, while JSON needs a big suitcase!
What We Tested
We asked an AI 300 questions about data stored in 4 different formats. This is like giving the same test to 4 students who each studied from different textbooks.
- 300 questions asked
- 20 different datasets
- 4 formats compared
- 6 question types
The Formats We Compared
| Format | What It Is | Like... |
|---|---|---|
| ISON | Table-based format designed for AI | A neat spreadsheet |
| TOON | Another compact format | Shorthand notes |
| JSON Compact | JSON with no extra spaces | A telegram message |
| JSON | Standard formatted JSON | A formal letter |
Technical Details
- Tokenizer: o200k_base (same as GPT-4o and GPT-5)
- LLM: DeepSeek (deepseek-chat) with temperature=0
- Validation: Type-aware deterministic comparison (no LLM judge)
- Date: December 25, 2025
Token Results: ISON Uses Far Fewer Tokens
Tokens are the "units" that AI uses to read and understand text. Fewer tokens = faster processing, lower costs, and more room for context.
Why does this matter?
Imagine you can only carry 100 marbles in your backpack. With JSON, each piece of information takes up 4 marbles. With ISON, it only takes 1 marble. So with ISON, you can carry 4 times as much information! This is why AI can understand more context and give better answers when using ISON.
ISON Won Every Single Dataset
Across all 20 different datasets - from small 5-record tables to large 100-record datasets - ISON used the fewest tokens. Every. Single. Time.
| Dataset Size | ISON | JSON | Savings |
|---|---|---|---|
| Small (5 users) | 67 tokens | 253 tokens | 73.5% saved |
| Medium (25 users) | 386 tokens | 1,465 tokens | 73.7% saved |
| Large (100 users) | 1,311 tokens | 5,749 tokens | 77.2% saved |
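The savings column follows directly from the token counts in the table; a quick stdlib-only check:

```python
# Verify the savings column: savings = 1 - (ISON tokens / JSON tokens).
# Token counts are taken from the table above.
pairs = {
    "Small (5 users)":   (67, 253),
    "Medium (25 users)": (386, 1465),
    "Large (100 users)": (1311, 5749),
}

for name, (ison, json_tokens) in pairs.items():
    savings = (1 - ison / json_tokens) * 100
    print(f"{name}: {savings:.1f}% saved")
```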
Technical Insight
ISON's efficiency increases with dataset size. This is because ISON's table structure eliminates repeated key names. In JSON, every single record repeats "id", "name", "email", etc. In ISON, these appear once as column headers. The more records you have, the more savings you get.
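The key-repetition effect can be demonstrated with a small sketch. The "table" layout below is a generic header-plus-rows rendering, not actual ISON syntax; it simply shows what happens when keys appear once instead of once per record (character counts are used as a rough proxy for tokens):

```python
import json

# Illustrates why key repetition dominates JSON size as records grow.
# The header+rows layout is a generic sketch, NOT actual ISON syntax.
records = [{"id": i, "name": f"user{i}", "email": f"user{i}@example.com"}
           for i in range(100)]

as_json = json.dumps(records)  # "id", "name", "email" repeated 100 times

header = "id name email"       # keys appear exactly once
rows = "\n".join(f'{r["id"]} {r["name"]} {r["email"]}' for r in records)
as_table = header + "\n" + rows

print(len(as_json), len(as_table))  # the table form is far smaller
```

The gap widens as the record count grows, which matches the small/medium/large trend in the table above.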
Accuracy Results: ISON Answers Just As Well
Saving tokens doesn't help if the AI can't understand the data. Good news: ISON achieves excellent accuracy!
| Format | Correct Answers | Accuracy |
|---|---|---|
| JSON Compact | 267 / 300 | 89.0% |
| TOON | 266 / 300 | 88.7% |
| ISON | 265 / 300 | 88.3% |
| JSON | 254 / 300 | 84.7% |
Accuracy by Question Type
We tested 6 different types of questions to make sure ISON works well for all use cases:
- Retrieval
- Counting
- Aggregation
- Relationships
- Edge Cases
- Filtering
What this tells us
Think of it like a spelling test. All four formats got about the same score - around 85-89%. But ISON studied from a much shorter book! Getting the same grade while reading less material means ISON is working smarter, not harder.
The Efficiency Score: Where ISON Truly Shines
The real magic happens when we combine tokens AND accuracy. We call this "Accuracy per 1,000 Tokens" (Acc/1K) - how much accuracy you get for each 1,000 tokens spent.
The Math
Acc/1K = (Accuracy %) / (Average tokens per dataset / 1,000)
For ISON: 88.3 / 3.55 = 24.9
For JSON: 84.7 / 12.67 = 6.7
This means ISON delivers 3.7x more accuracy per token than JSON!
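The same arithmetic as a small helper (average token counts of 3,550 and 12,670 per dataset are taken from the figures above):

```python
def acc_per_1k(accuracy_pct: float, avg_tokens: float) -> float:
    """Accuracy delivered per 1,000 tokens of input."""
    return accuracy_pct / (avg_tokens / 1000)

ison = acc_per_1k(88.3, 3550)       # ~24.9
json_fmt = acc_per_1k(84.7, 12670)  # ~6.7
print(round(ison / json_fmt, 1))    # ~3.7x more accuracy per token
```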
Real-world example
If you have $100 to spend on AI tokens:
- With JSON, you can ask about 79 datasets
- With ISON, you can ask about 282 datasets
That's 3.6x more value for your money!
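The dataset counts above can be reproduced if we assume a price of $0.10 per 1,000 tokens (the report does not state a price, so this rate is an assumption chosen to match the numbers):

```python
# Assumption: $0.10 per 1,000 tokens (price not stated in the report).
# Average tokens per dataset come from the efficiency math above.
budget_usd = 100
price_per_1k = 0.10
tokens_affordable = budget_usd / price_per_1k * 1000  # 1,000,000 tokens

for fmt, avg_tokens in [("JSON", 12670), ("ISON", 3550)]:
    print(fmt, round(tokens_affordable / avg_tokens), "datasets")
```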
How We Ran This Benchmark
Good science requires good methodology. Here's exactly how we tested:
1 Created 20 Diverse Datasets
From simple user tables to complex multi-table relationships. Sizes ranged from 5 to 100 records. Included edge cases like null values and special characters.
2 Converted to All 4 Formats
Each dataset was converted to ISON, TOON, JSON Compact, and JSON. Token counts measured with the exact tokenizer used by GPT-4o.
3 Wrote 300 Questions
15 questions per dataset covering retrieval, counting, aggregation, filtering, relationships, and edge cases. Each question has a known correct answer.
4 Asked a Real LLM
Used DeepSeek API with temperature=0 for reproducible results. Same prompt template for all formats. No cherry-picking - all 300 answers recorded.
5 Type-Aware Validation
Answers validated by type (string, number, boolean, list) with appropriate tolerance. No "LLM judge" - deterministic comparison for reproducibility.
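A type-aware comparison like the one described in step 5 can be sketched as follows. This is a simplified illustration, not the benchmark's actual validator:

```python
def answers_match(expected, actual, tol=1e-6):
    """Deterministic, type-aware answer comparison -- a simplified sketch."""
    # Check bool before numbers: bool is a subclass of int in Python.
    if isinstance(expected, bool):
        return isinstance(actual, bool) and expected == actual
    if isinstance(expected, (int, float)):
        try:
            return abs(expected - float(actual)) <= tol  # numeric tolerance
        except (TypeError, ValueError):
            return False
    if isinstance(expected, str):
        # Case- and whitespace-insensitive string match.
        return expected.strip().lower() == str(actual).strip().lower()
    if isinstance(expected, list):
        # Order-insensitive comparison on normalized items.
        def norm(xs):
            return sorted(str(x).strip().lower() for x in xs)
        return isinstance(actual, list) and norm(expected) == norm(actual)
    return expected == actual

print(answers_match(42, "42.0"))          # numeric tolerance handles "42.0"
print(answers_match("Alice", " alice "))  # normalization handles case/spacing
```

Because every branch is deterministic, re-running the validation on the same raw answers always yields the same score, which is the point of avoiding an LLM judge.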
6 Full Transparency
All code, datasets, questions, and raw results are available in the benchmark/ directory. Run it yourself and verify!
Following Industry Standards
This benchmark follows the methodology established by the TOON benchmark, with enhancements including more questions (300 vs 209), more datasets (20 vs 11), and type-aware validation.
Real-World Impact
What 72% Token Savings Means For You
| Scenario | With JSON | With ISON | Benefit |
|---|---|---|---|
| RAG Context Window | 25 documents | 89 documents | 3.6x more context |
| Monthly API Cost ($1,000 budget) | Data worth 79M tokens | Data worth 282M tokens | 72% cost reduction |
| Response Latency | Baseline | ~40% faster | Fewer tokens to process |
| Fine-tuning Dataset | 10,000 examples | 35,700 examples | Same token budget, 3.6x more data |
The Bottom Line
ISON isn't just faster - it's smarter. By representing data the way AI naturally understands it (in tables), ISON achieves the best balance of efficiency and accuracy. You save tokens without sacrificing quality.
Run the Benchmark Yourself
We believe in transparency. All our benchmark code is open source.
```shell
# Clone the repository
git clone https://github.com/maheshvaikri-code/ison.git
cd ison/benchmark

# Install dependencies
pip install tiktoken pyyaml requests ison-py

# Run full benchmark (takes ~45 minutes)
python benchmark_300.py

# Run token-only mode (no API calls, instant)
python benchmark_300.py --no-accuracy

# Run quick test (10 questions only)
python benchmark_300.py --dry-run
```
Output Files
| File | Description |
|---|---|
| `benchmark_300_*.log` | Full detailed results with per-question breakdown |
| `benchmark_300_*.json` | Machine-readable results for programmatic access |
| `BENCHMARK_300.md` | Human-readable summary report |
Conclusion
ISON: The Most Efficient Format for AI
72% fewer tokens. Same accuracy. 3.6x more efficient.
Fewer tokens, more context, better AI.