Per-Column Storage Modes Guide¶
HeliosDB-Lite v3.6+
This guide explains how to use per-column storage modes to optimize storage and query performance for different data characteristics.
Overview¶
HeliosDB-Lite provides four storage modes that can be applied on a per-column basis:
| Mode | Syntax | Best For | Storage Savings |
|---|---|---|---|
| Default | (none) | General OLTP workloads | Baseline |
| Dictionary | STORAGE DICTIONARY | Low-cardinality strings | 50-95% |
| Content-Addressed | STORAGE CONTENT_ADDRESSED | Large values with duplicates | ~100% dedup |
| Columnar | STORAGE COLUMNAR | Analytics, aggregations | 20-50% better |
All modes benefit from RocksDB's built-in LZ4 block compression (3.7-7x automatic compression).
Quick Start¶
CREATE TABLE with Storage Modes¶
CREATE TABLE events (
id INT PRIMARY KEY,
event_type TEXT STORAGE DICTIONARY, -- ~50 unique event types
user_country TEXT STORAGE DICTIONARY, -- ~250 country codes
payload TEXT STORAGE CONTENT_ADDRESSED, -- Large JSON payloads
metrics FLOAT8[] STORAGE COLUMNAR, -- Analytics aggregations
timestamp INT8 STORAGE COLUMNAR, -- Time-series data
description TEXT -- Default storage
);
ALTER TABLE for Existing Data¶
-- Migrate existing column to dictionary encoding
ALTER TABLE events ALTER COLUMN event_type SET STORAGE DICTIONARY;
-- Migrate to content-addressed storage
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;
-- Migrate to columnar storage
ALTER TABLE events ALTER COLUMN metrics SET STORAGE COLUMNAR;
-- Revert to default
ALTER TABLE events ALTER COLUMN description SET STORAGE DEFAULT;
Important: ALTER TABLE ... SET STORAGE performs an online migration that restructures all existing data without downtime.
Storage Mode Details¶
1. Default Storage¶
Standard row-oriented bincode serialization stored directly in RocksDB.
Characteristics:
- Values stored inline in row tuples
- Best for OLTP point queries
- Optimal for high-cardinality data
- No additional overhead

When to use:
- Primary keys and foreign keys
- High-cardinality strings (UUIDs, unique identifiers)
- Small numeric values
- Data queried by primary key
CREATE TABLE users (
id INT PRIMARY KEY, -- Default: primary key lookups
email TEXT, -- Default: unique per user
created_at TIMESTAMP -- Default: small fixed-size
);
2. Dictionary Encoding (STORAGE DICTIONARY)¶
Maps string values to compact integer IDs using a per-column dictionary.
Characteristics:
- Value → ID mapping stored separately
- Each unique value stored once in dictionary
- Row tuples contain only 4-byte IDs
- Best for <64K unique values

Storage format:
dict:{table}:{column} → serialized dictionary
Row tuple: Value::DictRef { dict_id: u32 }
When to use:
- Status codes (pending, approved, rejected)
- Country/region codes (250 unique values)
- Category names (product categories, tags)
- Enum-like string columns
Storage savings:

| Cardinality | Original | Dictionary | Savings |
|-------------|----------|------------|---------|
| 6 values | 10 bytes avg | 4 bytes | 60% |
| 100 values | 15 bytes avg | 4 bytes | 73% |
| 1000 values | 20 bytes avg | 4 bytes | 80% |
Example:
CREATE TABLE orders (
id INT PRIMARY KEY,
status TEXT STORAGE DICTIONARY, -- 6 values: pending, processing, shipped, delivered, cancelled, returned
payment_method TEXT STORAGE DICTIONARY, -- 5 values: card, paypal, bank, crypto, cash
country TEXT STORAGE DICTIONARY -- ~200 country codes
);
-- 10M orders with repetitive status values
-- Without dictionary: ~150 MB for status column
-- With dictionary: ~40 MB + tiny dictionary overhead
-- Savings: ~73%
3. Content-Addressed Storage (STORAGE CONTENT_ADDRESSED)¶
Hash-based deduplication using Blake3 for large values (>1KB).
Characteristics:
- Values hashed with Blake3 (fast cryptographic hash)
- Identical values stored once regardless of how many rows reference them
- Row tuples contain only 32-byte hash references
- Automatic: only applies to values >= 1KB

Storage format:
cas:{blake3_hash} → original bytes
Row tuple: Value::CasRef { hash: [u8; 32] }
When to use:
- Document content (many emails use same template)
- Profile photos/avatars (many users have default)
- JSON payloads (API responses with common structures)
- Large text fields with duplicates
Storage savings:

| Scenario | Duplicate Rate | Savings |
|----------|----------------|---------|
| Email templates | 80% same | ~80% |
| Default avatars | 95% same | ~95% |
| Error messages | 99% same | ~99% |
Example:
CREATE TABLE documents (
id INT PRIMARY KEY,
filename TEXT,
content TEXT STORAGE CONTENT_ADDRESSED, -- Many docs have same content
thumbnail BYTEA STORAGE CONTENT_ADDRESSED -- Many have default thumbnail
);
-- Insert 1M documents whose content all comes from 100 shared templates
INSERT INTO documents VALUES (1, 'welcome.txt', 'Welcome to our service...');
INSERT INTO documents VALUES (2, 'terms.txt', 'Welcome to our service...'); -- Same content, stored once
-- Storage: 100 unique templates * 5KB each = 500KB
-- Without CAS: 1M * 5KB = 5GB
-- Savings: ~99.99%
Note: Small values (<1KB) are stored inline to avoid hash overhead exceeding storage benefit.
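The dedup-plus-inline-threshold behavior can be sketched like this. It is a conceptual model only: Blake3 is not in Python's standard library, so stdlib blake2b stands in for it, and the `CasStore` class is hypothetical.

```python
# Sketch of content-addressed storage: large values are keyed by a 32-byte
# content hash so identical values are stored once; small values stay inline.
import hashlib

INLINE_THRESHOLD = 1024  # values under 1 KB are stored inline, per the note above

class CasStore:
    def __init__(self):
        self.blobs = {}  # hash -> bytes; each unique value stored exactly once

    def put(self, value: bytes):
        if len(value) < INLINE_THRESHOLD:
            return ("inline", value)  # hash overhead would exceed the benefit
        digest = hashlib.blake2b(value, digest_size=32).digest()
        self.blobs.setdefault(digest, value)  # dedup: only the first copy is kept
        return ("ref", digest)

    def get(self, ref):
        kind, payload = ref
        return payload if kind == "inline" else self.blobs[payload]

store = CasStore()
template = b"Welcome to our service..." * 100  # ~2.5 KB, above the threshold
r1 = store.put(template)
r2 = store.put(template)  # identical content: no new blob is written
```

Inserting the same template twice produces identical references and a single stored blob, mirroring the `welcome.txt`/`terms.txt` example above.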
4. Columnar Storage (STORAGE COLUMNAR)¶
Column-grouped batch storage optimized for analytics workloads.
Characteristics:
- Values grouped by column in batches of 1024
- Same-type values stored together for better LZ4 compression
- Efficient for aggregations (SUM, AVG, COUNT)
- Better compression for numeric sequences
Storage format:
col:{table}:{column}:{batch_id} → bincode Vec<Value> (1024 values)
Row tuple: Value::ColumnarRef (placeholder)
When to use:
- Time-series data (timestamps, metrics)
- Aggregation-heavy analytics
- Range scans over numeric columns
- Data warehouse fact tables
Performance benefits:

| Operation | Row-oriented | Columnar | Improvement |
|-----------|--------------|----------|-------------|
| SELECT SUM(amount) | Scan all columns | Scan one column | 3-5x faster |
| SELECT AVG(price) | Deserialize tuples | Read batch | 2-4x faster |
| Range scan | Random I/O | Sequential I/O | 2-3x faster |
Compression benefits:

| Data Pattern | Row-oriented LZ4 | Columnar LZ4 | Improvement |
|--------------|------------------|--------------|-------------|
| Sequential INT8 | 3.96x | 5-8x | +30-50% |
| Float sequences | 3.70x | 5-7x | +35-50% |
| Timestamps | 3.96x | 6-10x | +50-60% |
Example:
CREATE TABLE metrics (
id INT PRIMARY KEY,
sensor_id INT,
timestamp INT8 STORAGE COLUMNAR, -- Sequential timestamps
temperature FLOAT8 STORAGE COLUMNAR, -- Sensor readings
pressure FLOAT8 STORAGE COLUMNAR, -- Similar numeric pattern
status TEXT STORAGE DICTIONARY -- Mix with dictionary
);
-- Analytics query benefits from columnar storage
SELECT
sensor_id,
AVG(temperature) as avg_temp,
MAX(pressure) as max_pressure
FROM metrics
WHERE timestamp > 1704067200000
GROUP BY sensor_id;
-- Only reads timestamp, temperature, pressure columns
-- Skips id, sensor_id, status data
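The batching that makes this possible can be sketched in a few lines. This is a conceptual illustration, not HeliosDB-Lite's storage engine; the helper names are hypothetical.

```python
# Sketch of columnar batching: one column's values are grouped into fixed-size
# batches, so an aggregation reads only that column's batches and skips the rest.
BATCH_SIZE = 1024  # batch size stated in the storage-format description above

def to_batches(values, batch_size=BATCH_SIZE):
    # Split a column's values into consecutive batches of up to batch_size.
    return [values[i:i + batch_size] for i in range(0, len(values), batch_size)]

# 10,000 temperature readings; row storage would interleave these with id,
# sensor_id, and status on every tuple.
temperatures = [20.0 + (i % 5) for i in range(10_000)]
batches = to_batches(temperatures)

# SUM touches only this column's batches, never the other columns.
total = sum(sum(batch) for batch in batches)
```

Because each batch holds 1024 same-typed values laid out together, LZ4 also compresses them better than interleaved row tuples, which is the source of the compression gains in the table above.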
Combining Storage Modes¶
You can use different storage modes for different columns in the same table:
CREATE TABLE e_commerce_events (
-- Default storage for keys
id INT PRIMARY KEY,
user_id INT,
-- Dictionary for low-cardinality
event_type TEXT STORAGE DICTIONARY, -- click, view, purchase, etc.
country TEXT STORAGE DICTIONARY, -- ~200 countries
device_type TEXT STORAGE DICTIONARY, -- mobile, desktop, tablet
-- Content-addressed for large duplicates
user_agent TEXT STORAGE CONTENT_ADDRESSED, -- Many users same browser
referrer_url TEXT STORAGE CONTENT_ADDRESSED, -- Common referrers
-- Columnar for analytics
timestamp INT8 STORAGE COLUMNAR, -- Time-series analysis
session_duration INT4 STORAGE COLUMNAR, -- Aggregations
page_views INT4 STORAGE COLUMNAR -- Aggregations
);
Online Migration¶
When you change a column's storage mode, HeliosDB-Lite performs an online migration:
- Scan existing rows one by one
- Decode from old format (if any)
- Encode to new format
- Write back to storage
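The four steps above can be sketched as a single loop. This is a simplified model: `decode_old` and `encode_new` are hypothetical stand-ins for the per-mode codecs, and the real migration writes through the storage engine rather than mutating rows in memory.

```python
# Sketch of a per-column storage-mode migration following the steps above.
def migrate_column(rows, column, decode_old, encode_new):
    for row in rows:                      # 1. scan existing rows one by one
        raw = row[column]
        value = decode_old(raw)           # 2. decode from the old format (if any)
        row[column] = encode_new(value)   # 3. encode to the new format
    return rows                           # 4. write back to storage

# Example: Default -> Dictionary, where plain strings become integer IDs.
dictionary = {}
def encode_dict(value):
    # setdefault assigns the next ID on first sight, reuses it afterwards
    return dictionary.setdefault(value, len(dictionary))

rows = [{"status": "pending"}, {"status": "shipped"}, {"status": "pending"}]
migrated = migrate_column(rows, "status", lambda raw: raw, encode_dict)
```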
Migration Performance¶
| Migration | Rows/sec |
|---|---|
| Default → Dictionary | 250K-300K |
| Default → CAS | 200K-250K |
| Default → Columnar | 150K-200K |
| Dictionary → Default | 300K-350K |
| CAS → Default | 200K-250K |
Example Migration¶
-- Check current storage (if supported)
SELECT column_name, storage_mode
FROM information_schema.columns
WHERE table_name = 'events';
-- Migrate status column (1M rows takes ~4 seconds)
ALTER TABLE events ALTER COLUMN status SET STORAGE DICTIONARY;
-- Migrate payload column (1M rows with 10KB payloads takes ~5 seconds)
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;
Best Practices¶
1. Choose Based on Data Characteristics¶
| Data Pattern | Recommended Mode |
|---|---|
| <1000 unique values | DICTIONARY |
| >50% duplicate large values (>1KB) | CONTENT_ADDRESSED |
| Time-series or aggregation targets | COLUMNAR |
| Everything else | DEFAULT |
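The decision table above can be expressed as a small helper. The thresholds are the ones stated in this guide; the function itself is illustrative and not part of HeliosDB-Lite.

```python
# Sketch of the storage-mode decision table: pick a mode from the column's
# data characteristics using the thresholds given in this guide.
def recommend_storage(unique_values: int, avg_size_bytes: int,
                      duplicate_rate: float, is_aggregated: bool) -> str:
    if unique_values < 1000:
        return "DICTIONARY"            # low cardinality
    if avg_size_bytes > 1024 and duplicate_rate > 0.5:
        return "CONTENT_ADDRESSED"     # large values, >50% duplicates
    if is_aggregated:
        return "COLUMNAR"              # time-series / aggregation targets
    return "DEFAULT"                   # everything else

# A 6-value status column, large duplicated payloads, and an aggregated metric:
modes = [recommend_storage(6, 10, 0.0, False),
         recommend_storage(1_000_000, 5_000, 0.7, False),
         recommend_storage(1_000_000, 8, 0.0, True)]
```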
2. Don't Over-Optimize¶
- Default is fine for most columns
- Only use specialized modes when there's clear benefit
- Profile your actual query patterns
3. Consider Query Patterns¶
| Query Pattern | Best Mode |
|---|---|
| Point lookups by PK | DEFAULT |
| Filter by category | DICTIONARY |
| Full-text or document search | CONTENT_ADDRESSED |
| Aggregations (SUM, AVG, COUNT) | COLUMNAR |
| Range scans | COLUMNAR |
4. Monitor Storage Usage¶
-- Check table sizes (when available)
SELECT
table_name,
pg_total_relation_size(table_name) as total_size
FROM information_schema.tables
WHERE table_schema = 'public';
Storage Mode Reference¶
SQL Syntax¶
CREATE TABLE:
CREATE TABLE table_name (
column_name TYPE STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR},
...
);
ALTER TABLE:
ALTER TABLE table_name
ALTER COLUMN column_name
SET STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR};
Internal Value Types¶
pub enum Value {
// ... standard types ...
/// Dictionary-encoded string reference
DictRef { dict_id: u32 },
/// Content-addressed hash reference
CasRef { hash: [u8; 32] },
/// Placeholder for columnar storage
ColumnarRef,
}
Key Formats¶
| Mode | Key Pattern | Value |
|---|---|---|
| Dictionary | dict:{table}:{column} | Serialized dictionary |
| CAS | cas:{blake3_hash} | Original bytes |
| Columnar | col:{table}:{column}:{batch_id} | Batch of 1024 values |
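For reference, the key patterns above can be built like this. The helper functions are illustrative, not a HeliosDB-Lite API.

```python
# Sketch of RocksDB key construction following the patterns in the table above.
def dict_key(table: str, column: str) -> bytes:
    return f"dict:{table}:{column}".encode()

def cas_key(blake3_hash: bytes) -> bytes:
    return b"cas:" + blake3_hash  # 32-byte content hash

def col_key(table: str, column: str, batch_id: int) -> bytes:
    return f"col:{table}:{column}:{batch_id}".encode()

keys = [dict_key("events", "status"),
        col_key("metrics", "temperature", 0)]
```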
Troubleshooting¶
Dictionary Too Large¶
If you see warnings about dictionary size exceeding 64K entries:
-- Check cardinality before using dictionary
SELECT COUNT(DISTINCT status) FROM orders;
-- If >64K unique values, use DEFAULT instead
ALTER TABLE orders ALTER COLUMN status SET STORAGE DEFAULT;
CAS Not Deduplicating¶
Content-addressed storage only deduplicates values >= 1KB:
-- Check average value size
SELECT AVG(LENGTH(content)) FROM documents;
-- If <1KB, CAS won't help - use DEFAULT
ALTER TABLE documents ALTER COLUMN content SET STORAGE DEFAULT;
Slow Aggregations¶
If aggregations are slow on DEFAULT storage, consider COLUMNAR:
-- Profile query
EXPLAIN ANALYZE SELECT SUM(amount) FROM transactions;
-- If scanning many columns, switch to columnar
ALTER TABLE transactions ALTER COLUMN amount SET STORAGE COLUMNAR;
Performance Benchmarks¶
10GB Dataset Results¶
| Storage Mode | Insert Rate | Storage Size | Compression |
|---|---|---|---|
| Default + LZ4 | 700K rows/sec | 2.7 GB | 3.96x |
| Dictionary + LZ4 | 650K rows/sec | 1.2 GB | 8.9x |
| CAS (70% dupes) | 550K rows/sec | 0.8 GB | 13.4x |
| Columnar + LZ4 | 600K rows/sec | 2.1 GB | 5.1x |
Query Performance¶
| Query Type | Default | Columnar | Improvement |
|---|---|---|---|
| Point lookup | 0.1ms | 0.3ms | 3x slower |
| Full scan | 850ms | 320ms | 2.6x better |
| SUM aggregation | 450ms | 120ms | 3.7x better |
| AVG aggregation | 480ms | 130ms | 3.7x better |
Last Updated: 2026-01-17
Version: HeliosDB-Lite v3.6+