Overview
ISONL is to ISON what JSONL is to JSON - a line-based streaming format where each line is a self-contained record. ISONL maintains ISON's token efficiency while enabling:
Streaming Processing
Process records line-by-line without loading entire files. Ideal for large datasets and real-time pipelines.
LLM Fine-Tuning
Store training examples in a token-efficient tabular format that resembles the structured text LLMs commonly see in training data.
Append-Only Logs
Add new records without rewriting files. Perfect for event streams, audit logs, and incremental exports.
Cross-Record References
Unlike JSONL, ISONL supports native references between records with the :ID syntax.
Token Efficiency at Scale
Format
Each ISONL line has three pipe-separated sections:
kind.name|field1 field2 field3|value1 value2 value3
| Section | Description | Example |
|---|---|---|
| Header | Block kind and name | table.users |
| Fields | Space-separated field names | id name email |
| Values | Space-separated values | 1 Alice alice@test.com |
Complete Example
# User records
table.users|id name email active|1 Alice alice@example.com true
table.users|id name email active|2 Bob bob@example.com false
table.users|id name email active|3 "Charlie Brown" charlie@example.com true
# Order records with references
table.orders|id user_id total|101 :1 99.99
table.orders|id user_id total|102 :2 149.50
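To make the three-section layout concrete, here is a minimal sketch of splitting one line by hand using only the standard library. The `Record` container and `parse_isonl_line` function are hypothetical names for illustration, not part of `ison_parser`. The sketch assumes that a literal `|` never appears inside a quoted value and that quoting follows the simple double-quote style shown above.

```python
import shlex
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    """One parsed ISONL line (hypothetical container, not the library's type)."""
    kind: str
    name: str
    fields: List[str]
    values: List[str]


def parse_isonl_line(line: str) -> Optional[Record]:
    """Split one ISONL line into its header, field names, and raw string values."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank lines and comments carry no record
    # Assumption: "|" never occurs inside a quoted value, so a plain split is safe.
    header, fields_part, values_part = line.split("|", 2)
    kind, _, name = header.partition(".")
    fields = fields_part.split()
    # shlex.split keeps "Charlie Brown" together as one value and drops the quotes.
    values = shlex.split(values_part)
    if len(fields) != len(values):
        raise ValueError(f"{len(fields)} fields but {len(values)} values: {line!r}")
    return Record(kind, name, fields, values)


record = parse_isonl_line(
    'table.users|id name email active|3 "Charlie Brown" charlie@example.com true'
)
print(record.kind, record.name, dict(zip(record.fields, record.values)))
```

Values are returned as raw strings here; type inference is a separate step, described in the next section.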
Data Types
ISONL uses automatic type inference (same as ISON):
| Type | Recognition | Example |
|---|---|---|
| Boolean | true / false | true |
| Null | null | null |
| Integer | Digits with an optional leading - | 42, -7 |
| Float | Decimal number | 3.14, -0.5 |
| Reference | Starts with : | :10, :user:101 |
| String | Everything else | Alice, "New York" |
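The recognition rules in the table can be sketched as a single function that checks each rule in order. This is an illustration only: `infer_value` and the `Ref` wrapper are hypothetical names, and the sketch assumes that a double-quoted token is always a string, even when its contents look like a number.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical wrapper for a reference such as :10 or :user:101."""
    target: str


_INT = re.compile(r"-?\d+")
_FLOAT = re.compile(r"-?\d+\.\d+")


def infer_value(token: str):
    """Map one raw token to a Python value, following the table row by row."""
    # Assumption: quoted tokens are strings regardless of their contents.
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    if token == "true":
        return True
    if token == "false":
        return False
    if token == "null":
        return None
    if _INT.fullmatch(token):
        return int(token)
    if _FLOAT.fullmatch(token):
        return float(token)
    if token.startswith(":"):
        return Ref(token[1:])
    return token  # everything else is a string


for raw in ["true", "null", "42", "-0.5", ":user:101", "Alice", '"New York"']:
    print(repr(raw), "->", repr(infer_value(raw)))
```

Because the integer and float patterns must match the whole token, values such as `10:30:00` or `alice@example.com` fall through to the string case.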
ISONL vs JSONL
JSONL (67 tokens)
{"id": 1, "name": "Alice", "email": "alice@example.com", "active": true}
{"id": 2, "name": "Bob", "email": "bob@example.com", "active": false}
{"id": 3, "name": "Charlie", "email": "charlie@example.com", "active": true}
ISONL (40 tokens, about 40% fewer than the JSONL above)
table.users|id name email active|1 Alice alice@example.com true
table.users|id name email active|2 Bob bob@example.com false
table.users|id name email active|3 Charlie charlie@example.com true
Token Savings at Scale
| Records | JSONL Tokens | ISONL Tokens | Savings |
|---|---|---|---|
| 10 | ~220 | ~140 | 36% |
| 100 | ~2,200 | ~1,300 | 41% |
| 1,000 | ~22,000 | ~12,000 | 45% |
| 10,000 | ~220,000 | ~115,000 | 48% |
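The Savings column follows directly from the two token columns: savings = 1 - ISONL tokens / JSONL tokens. The short sketch below recomputes it from the approximate counts in the table; the function name `savings_percent` is only for illustration.

```python
# (records, JSONL tokens, ISONL tokens), copied from the table above
ROWS = [
    (10, 220, 140),
    (100, 2_200, 1_300),
    (1_000, 22_000, 12_000),
    (10_000, 220_000, 115_000),
]


def savings_percent(jsonl_tokens: int, isonl_tokens: int) -> float:
    """Percentage of tokens saved by ISONL relative to JSONL."""
    return 100 * (1 - isonl_tokens / jsonl_tokens)


for records, jsonl_tokens, isonl_tokens in ROWS:
    pct = savings_percent(jsonl_tokens, isonl_tokens)
    print(f"{records:>6} records: {pct:.1f}% fewer tokens")
```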
Feature Comparison
| Feature | JSONL | ISONL |
|---|---|---|
| Self-describing lines | Yes | Yes (header per line) |
| Type inference | JSON types | ISON types |
| Cross-record references | No | Yes (:ID syntax) |
| Block types | No | Yes (table., object.) |
| Comments | No | Yes (#) |
| Token efficiency | Baseline | 36-48% fewer tokens (see table above) |
Use Cases
Fine-Tuning Datasets
Store SFT, DPO, and RLHF training data. Datasets that use 30-70% fewer tokens load faster.
table.sft|instruction response|"Explain AI" "Detailed answer..."
table.sft|instruction response|"Write code" "def hello(): ..."
Event Streams
Real-time event processing with constant memory usage.
table.events|ts type user|10:30:00 click :1
table.events|ts type user|10:30:05 pageview :1
RAG Chunk Storage
Store document chunks for retrieval, with references back to their source documents.
table.chunks|id doc content|C1 :D1 "First paragraph..."
table.chunks|id doc content|C2 :D1 "Second paragraph..."
Agent Message Logs
Log multi-agent communications for analysis and replay.
table.msgs|ts from to content|10:00 planner coder "Implement API"
table.msgs|ts from to content|10:05 coder planner "Need clarification"
Python API
loads_isonl(text: str) -> Document
Parse ISONL text into a Document object:
from ison_parser import loads_isonl
isonl = """
table.users|id name email|1 Alice alice@test.com
table.users|id name email|2 Bob bob@test.com
"""
doc = loads_isonl(isonl)
for block in doc.blocks:
    for row in block.rows:
        print(f"{row['name']}: {row['email']}")
dumps_isonl(doc: Document) -> str
Serialize a Document to ISONL format:
from ison_parser import loads, dumps_isonl
# Parse ISON
doc = loads("""
table.users
id name
1 Alice
2 Bob
""")
# Convert to ISONL
isonl = dumps_isonl(doc)
print(isonl)
# table.users|id name|1 Alice
# table.users|id name|2 Bob
ison_to_isonl(ison_text: str) -> str
Direct conversion from ISON to ISONL:
from ison_parser import ison_to_isonl
ison = """
table.orders
id customer total
1 Alice 99.99
2 Bob 149.50
"""
isonl = ison_to_isonl(ison)
print(isonl)
isonl_to_ison(isonl_text: str) -> str
Convert ISONL back to multi-line ISON:
from ison_parser import isonl_to_ison
isonl = """
table.users|id name|1 Alice
table.users|id name|2 Bob
"""
ison = isonl_to_ison(isonl)
print(ison)
# table.users
# id name
# 1 Alice
# 2 Bob
JavaScript API
loadsISONL(text)
Parse ISONL text into a Document object:
import { loadsISONL } from './ison-parser.js';
const isonl = `
table.users|id name email|1 Alice alice@test.com
table.users|id name email|2 Bob bob@test.com
`;
const doc = loadsISONL(isonl);
doc.blocks.forEach(block => {
block.rows.forEach(row => {
console.log(`${row.name}: ${row.email}`);
});
});
dumpsISONL(doc)
Serialize a Document to ISONL format:
import { loads, dumpsISONL } from './ison-parser.js';
const doc = loads(`
table.users
id name
1 Alice
2 Bob
`);
const isonl = dumpsISONL(doc);
console.log(isonl);
isonToISONL(isonText) / isonlToISON(isonlText)
Direct format conversion:
import { isonToISONL, isonlToISON } from './ison-parser.js';
// ISON -> ISONL
const isonl = isonToISONL(isonText);
// ISONL -> ISON
const ison = isonlToISON(isonlText);
Streaming
ISONL's line-based format enables constant-memory streaming:
Python Generator
from ison_parser import isonl_stream
# Stream from file (constant memory)
with open("large_dataset.isonl", "r") as f:
    for record in isonl_stream(f):
        # Process one record at a time
        print(f"{record.kind}.{record.name}: {record.values}")
JavaScript Generator
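import fs from 'node:fs';
import { isonlStream } from './ison-parser.js';
// Stream from lines array or generator
// Note: readFileSync loads the whole file; pass a line generator instead for constant memory
const lines = fs.readFileSync('data.isonl', 'utf8').split('\n');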
for (const record of isonlStream(lines)) {
console.log(`${record.kind}.${record.name}:`, record.values);
}
Memory Comparison
| Approach | Memory Usage | Use Case |
|---|---|---|
| loads_isonl() | O(n) - loads entire file | Small files, need all data |
| isonl_stream() | O(1) - one record at a time | Large files, ETL, filtering |
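The constant-memory behavior comes from reading lazily: a generator holds one line at a time and yields each record before touching the next. The sketch below shows the idea with the standard library only. `stream_records` is a hypothetical stand-in, not the implementation of `isonl_stream`, and it assumes no `|` inside quoted values.

```python
import io
import shlex
from typing import Dict, Iterable, Iterator, Tuple


def stream_records(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (header, row) pairs one at a time from any iterable of lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Assumption: "|" never occurs inside a quoted value.
        header, fields, values = line.split("|", 2)
        yield header, dict(zip(fields.split(), shlex.split(values)))


# A file object is itself a lazy iterable of lines; StringIO stands in for one here.
source = io.StringIO(
    "# User records\n"
    "table.users|id name active|1 Alice true\n"
    "table.users|id name active|2 Bob false\n"
)

# Filtering happens record by record, so memory use does not grow with file size.
active = [row["name"] for _, row in stream_records(source) if row["active"] == "true"]
print(active)
```

Passing an open file handle instead of the `StringIO` object gives the same behavior on a large file, since Python reads file lines on demand.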
Conversion
ISON to ISONL
Each row in an ISON block becomes one ISONL line:
ISON (multi-line)
table.users
id name email
1 Alice alice@example.com
2 Bob bob@example.com
ISONL (one line per record)
table.users|id name email|1 Alice alice@example.com
table.users|id name email|2 Bob bob@example.com
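The row-to-line mapping above can be sketched in a few lines. This is not the library's `ison_to_isonl`; `ison_block_to_isonl` is a hypothetical helper that assumes the input holds exactly one block, with the header on the first non-empty line, the field names on the second, and one row per remaining line.

```python
from typing import List


def ison_block_to_isonl(block_text: str) -> List[str]:
    """Turn a single ISON block into one ISONL line per row."""
    lines = [ln.strip() for ln in block_text.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a field line")
    header, fields, rows = lines[0], lines[1], lines[2:]
    # Every output line repeats the header and field list, which is what
    # makes each ISONL record self-contained.
    return [f"{header}|{fields}|{row}" for row in rows]


ison = """
table.users
id name email
1 Alice alice@example.com
2 Bob bob@example.com
"""

for line in ison_block_to_isonl(ison):
    print(line)
```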
ISONL to ISON
Lines with the same header are grouped into a single ISON block:
# Input ISONL (mixed blocks)
table.users|id name|1 Alice
object.config|timeout debug|30 true
table.users|id name|2 Bob
# Output ISON (grouped)
table.users
id name
1 Alice
2 Bob
object.config
timeout debug
30 true
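The grouping step can be sketched with a dictionary keyed by header and field list, which keeps blocks in first-seen order. `group_isonl` is a hypothetical helper rather than the library's `isonl_to_ison`. It assumes that ISON blocks are separated by a blank line and that lines sharing a header but differing in fields belong to separate blocks.

```python
from typing import Dict, Iterable, List, Tuple


def group_isonl(lines: Iterable[str]) -> str:
    """Group ISONL lines by (header, fields) and emit multi-line ISON text."""
    blocks: Dict[Tuple[str, str], List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines are dropped in this sketch
        header, fields, values = line.split("|", 2)
        # dict preserves insertion order, so blocks appear in first-seen order
        blocks.setdefault((header, fields), []).append(values)
    rendered = [
        "\n".join([header, fields, *rows]) for (header, fields), rows in blocks.items()
    ]
    return "\n\n".join(rendered)


isonl = """
# Input ISONL (mixed blocks)
table.users|id name|1 Alice
object.config|timeout debug|30 true
table.users|id name|2 Bob
"""

print(group_isonl(isonl.splitlines()))
```

Note that grouping has to see every line before it can emit a complete block, so this direction of the conversion is not a constant-memory operation.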
Examples
Fine-Tuning Dataset (SFT)
# Supervised Fine-Tuning format
table.sft|instruction input output|"Summarize this text" "Long article about AI..." "AI advances rapidly..."
table.sft|instruction input output|"Translate to Spanish" "Hello, how are you?" "Hola, como estas?"
table.sft|instruction input output|"Extract entities" "Apple announced new iPhone" "[Apple, iPhone]"
Preference Learning (DPO/RLHF)
# Direct Preference Optimization format
table.prefs|prompt chosen rejected|"Explain gravity" "Gravity is a force..." "idk its heavy"
table.prefs|prompt chosen rejected|"Write a poem" "Roses are red..." "poem stuff here"
Event Stream with References
# Users
table.users|id name|U1 Alice
table.users|id name|U2 Bob
# Events referencing users
table.events|ts type user data|2025-01-15T10:30:00Z click :U1 "button=submit"
table.events|ts type user data|2025-01-15T10:30:05Z pageview :U1 "/dashboard"
table.events|ts type user data|2025-01-15T10:30:10Z login :U2 "method=oauth"
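One way to consume these references is to index records by `id` as they stream past and look up each `:ID` value against that index. The sketch below is an assumption about how a consumer might do this, not a description of how `ison_parser` resolves references. It assumes `:U1` points at the `table.users` row whose `id` is `U1`, and that referenced rows appear before the rows that reference them.

```python
import shlex
from typing import Dict, Iterable, List, Optional

LINES = [
    "# Users",
    "table.users|id name|U1 Alice",
    "table.users|id name|U2 Bob",
    "# Events referencing users",
    'table.events|ts type user data|2025-01-15T10:30:00Z click :U1 "button=submit"',
    'table.events|ts type user data|2025-01-15T10:30:05Z pageview :U1 "/dashboard"',
    'table.events|ts type user data|2025-01-15T10:30:10Z login :U2 "method=oauth"',
]


def resolve_events(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Replace each event's :ID reference with the matching user row."""
    users: Dict[str, Dict[str, str]] = {}
    events: List[Dict[str, object]] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        header, fields, values = line.split("|", 2)
        row = dict(zip(fields.split(), shlex.split(values)))
        if header == "table.users":
            users[row["id"]] = row  # index users by id for later lookups
        elif header == "table.events":
            ref = row["user"]
            # Assumption: ":U1" refers to the user whose id is "U1".
            user: Optional[Dict[str, str]] = users.get(ref[1:]) if ref.startswith(":") else None
            events.append({**row, "user": user})
    return events


for event in resolve_events(LINES):
    print(event["ts"], event["type"], event["user"]["name"])
```

If references may point forward to records that arrive later, a consumer would need to buffer unresolved events or make a second pass.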
Append-Only Export
# Initial export
python -c "from ison_parser import ison_to_isonl; print(ison_to_isonl(open('data.ison').read()))" >> stream.isonl
# Later incremental updates (append only)
python -c "from ison_parser import ison_to_isonl; print(ison_to_isonl(open('new_data.ison').read()))" >> stream.isonl
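The same append-only pattern can be used from inside a program by opening the file in append mode and writing one line per record. The helpers below, `format_value` and `append_record`, are hypothetical and not part of `ison_parser`. The sketch assumes that values containing whitespace are wrapped in double quotes, and it does not handle embedded quotes, `|` characters, or strings that look like other types (such as the text "true").

```python
import os
import tempfile
from typing import Mapping


def format_value(value: object) -> str:
    """Render one Python value as an ISONL token (simplified quoting rules)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before other types: bool is an int subclass
        return "true" if value else "false"
    text = str(value)
    # Assumption: quoting is only needed for empty strings or embedded whitespace.
    if text == "" or any(ch.isspace() for ch in text):
        return f'"{text}"'
    return text


def append_record(path: str, header: str, record: Mapping[str, object]) -> None:
    """Append a single self-contained line; existing lines are never rewritten."""
    fields = " ".join(record)
    values = " ".join(format_value(v) for v in record.values())
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{header}|{fields}|{values}\n")


# Demo in a temporary directory so no real file is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "stream.isonl")
append_record(demo_path, "table.users", {"id": 3, "name": "Charlie Brown", "active": True})
append_record(demo_path, "table.users", {"id": 4, "name": "Dana", "active": None})
with open(demo_path, encoding="utf-8") as f:
    print(f.read(), end="")
```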