Per-Column Storage Modes Guide

HeliosDB-Lite v3.6+

This guide explains how to use per-column storage modes to optimize storage and query performance for different data characteristics.


Overview

HeliosDB-Lite provides four storage modes that can be applied on a per-column basis:

| Mode | Syntax | Best For | Storage Savings |
|------|--------|----------|-----------------|
| Default | (none) | General OLTP workloads | Baseline |
| Dictionary | STORAGE DICTIONARY | Low-cardinality strings | 50-95% |
| Content-Addressed | STORAGE CONTENT_ADDRESSED | Large values with duplicates | ~100% dedup |
| Columnar | STORAGE COLUMNAR | Analytics, aggregations | 20-50% better |

All modes benefit from RocksDB's built-in LZ4 block compression (3.7-7x automatic compression).


Quick Start

CREATE TABLE with Storage Modes

CREATE TABLE events (
    id INT PRIMARY KEY,
    event_type TEXT STORAGE DICTIONARY,           -- ~50 unique event types
    user_country TEXT STORAGE DICTIONARY,         -- ~250 country codes
    payload TEXT STORAGE CONTENT_ADDRESSED,       -- Large JSON payloads
    metrics FLOAT8[] STORAGE COLUMNAR,            -- Analytics aggregations
    timestamp INT8 STORAGE COLUMNAR,              -- Time-series data
    description TEXT                              -- Default storage
);

ALTER TABLE for Existing Data

-- Migrate existing column to dictionary encoding
ALTER TABLE events ALTER COLUMN event_type SET STORAGE DICTIONARY;

-- Migrate to content-addressed storage
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;

-- Migrate to columnar storage
ALTER TABLE events ALTER COLUMN metrics SET STORAGE COLUMNAR;

-- Revert to default
ALTER TABLE events ALTER COLUMN description SET STORAGE DEFAULT;

Important: ALTER TABLE ... SET STORAGE performs an online migration that restructures all existing data without downtime.


Storage Mode Details

1. Default Storage

Standard row-oriented bincode serialization stored directly in RocksDB.

Characteristics:

  • Values stored inline in row tuples
  • Best for OLTP point queries
  • Optimal for high-cardinality data
  • No additional overhead

When to use:

  • Primary keys and foreign keys
  • High-cardinality strings (UUIDs, unique identifiers)
  • Small numeric values
  • Data queried by primary key

CREATE TABLE users (
    id INT PRIMARY KEY,           -- Default: primary key lookups
    email TEXT,                   -- Default: unique per user
    created_at TIMESTAMP          -- Default: small fixed-size
);

2. Dictionary Encoding (STORAGE DICTIONARY)

Maps string values to compact integer IDs using a per-column dictionary.

Characteristics:

  • Value → ID mapping stored separately
  • Each unique value stored once in the dictionary
  • Row tuples contain only 4-byte IDs
  • Best for <64K unique values

Storage format:

dict:{table}:{column} → bincode HashMap<String, u32>
Row tuple: Value::DictRef { dict_id: u32 }

When to use:

  • Status codes (pending, approved, rejected)
  • Country/region codes (~250 unique values)
  • Category names (product categories, tags)
  • Enum-like string columns

Storage savings:

| Cardinality | Original | Dictionary | Savings |
|-------------|----------|------------|---------|
| 6 values | 10 bytes avg | 4 bytes | 60% |
| 100 values | 15 bytes avg | 4 bytes | 73% |
| 1000 values | 20 bytes avg | 4 bytes | 80% |
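The value-to-ID mapping can be sketched in Python. This is an illustrative simulation of the scheme, not the engine's Rust internals; the class and its names are hypothetical:

```python
import struct

class DictionaryColumn:
    """Simulates per-column dictionary encoding: each unique string is
    stored once, and row tuples hold only a fixed 4-byte integer ID."""
    MAX_ENTRIES = 65536  # the mode is documented as best for <64K unique values

    def __init__(self):
        self.value_to_id = {}   # what dict:{table}:{column} would hold
        self.id_to_value = []

    def encode(self, value: str) -> int:
        if value not in self.value_to_id:
            if len(self.value_to_id) >= self.MAX_ENTRIES:
                raise OverflowError("dictionary full; fall back to DEFAULT storage")
            self.value_to_id[value] = len(self.id_to_value)
            self.id_to_value.append(value)
        return self.value_to_id[value]

    def decode(self, dict_id: int) -> str:
        return self.id_to_value[dict_id]

statuses = ["pending", "shipped", "pending", "delivered", "pending"]
col = DictionaryColumn()
ids = [col.encode(s) for s in statuses]

raw_bytes = sum(len(s.encode()) for s in statuses)  # inline strings: 37 bytes
dict_bytes = len(ids) * struct.calcsize("<I")       # 4-byte IDs: 20 bytes
print(ids, raw_bytes, dict_bytes)
```

The repeated "pending" values collapse to a single dictionary entry; the per-row cost is a constant 4 bytes regardless of string length, which is where the savings in the table above come from.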

Example:

CREATE TABLE orders (
    id INT PRIMARY KEY,
    status TEXT STORAGE DICTIONARY,     -- 6 values: pending, processing, shipped, delivered, cancelled, returned
    payment_method TEXT STORAGE DICTIONARY, -- 5 values: card, paypal, bank, crypto, cash
    country TEXT STORAGE DICTIONARY     -- ~200 country codes
);

-- 10M orders with repetitive status values
-- Without dictionary: ~150 MB for status column
-- With dictionary: ~40 MB + tiny dictionary overhead
-- Savings: ~73%


3. Content-Addressed Storage (STORAGE CONTENT_ADDRESSED)

Hash-based deduplication using Blake3 for large values (>1KB).

Characteristics:

  • Values hashed with Blake3 (a fast cryptographic hash)
  • Identical values stored once, regardless of how many rows reference them
  • Row tuples contain only 32-byte hash references
  • Automatic: only applies to values >= 1KB

Storage format:

cas:{blake3_hash} → original value bytes
Row tuple: Value::CasRef { hash: [u8; 32] }

When to use:

  • Document content (many emails use the same template)
  • Profile photos/avatars (many users have the default)
  • JSON payloads (API responses with common structures)
  • Large text fields with duplicates

Storage savings:

| Scenario | Duplicate Rate | Savings |
|----------|----------------|---------|
| Email templates | 80% same | ~80% |
| Default avatars | 95% same | ~95% |
| Error messages | 99% same | ~99% |
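The dedup-by-hash mechanism can be sketched in Python. Note the assumptions: Python's standard library has no Blake3, so blake2b stands in here (the engine itself uses Blake3), and the store is a plain dict rather than RocksDB:

```python
import hashlib

CAS_THRESHOLD = 1024  # values under 1 KB stay inline (hash overhead not worth it)
HASH_LEN = 32         # engine stores 32-byte Blake3 hashes; blake2b is a stand-in

cas_store = {}  # simulates cas:{hash} -> original value bytes

def store_value(value: bytes):
    """Return what the row tuple would hold: the inline bytes for small
    values, or a 32-byte content hash for large ones."""
    if len(value) < CAS_THRESHOLD:
        return ("inline", value)
    digest = hashlib.blake2b(value, digest_size=HASH_LEN).digest()
    cas_store.setdefault(digest, value)  # identical content is stored only once
    return ("cas_ref", digest)

template = b"Welcome to our service..." * 100  # ~2.5 KB payload
kind1, ref1 = store_value(template)
kind2, ref2 = store_value(template)            # duplicate row: same hash, no new entry
print(kind1, len(cas_store))
```

Both rows end up holding the same 32-byte reference while the 2.5 KB payload is stored once, which is exactly why the savings above track the duplicate rate.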

Example:

CREATE TABLE documents (
    id INT PRIMARY KEY,
    filename TEXT,
    content TEXT STORAGE CONTENT_ADDRESSED,  -- Many docs have same content
    thumbnail BYTEA STORAGE CONTENT_ADDRESSED -- Many have default thumbnail
);

-- Insert 1M documents where 70% are duplicates of 100 templates
INSERT INTO documents VALUES (1, 'welcome.txt', 'Welcome to our service...');
INSERT INTO documents VALUES (2, 'terms.txt', 'Welcome to our service...');  -- Same content, stored once

-- Deduplicated: 300K unique docs * 5KB + 100 templates * 5KB = ~1.5 GB
-- Without CAS: 1M * 5KB = 5 GB
-- Savings: ~70% (matching the 70% duplicate rate)

Note: Small values (<1KB) are stored inline to avoid hash overhead exceeding storage benefit.


4. Columnar Storage (STORAGE COLUMNAR)

Column-grouped batch storage optimized for analytics workloads.

Characteristics:

  • Values grouped by column in batches of 1024
  • Same-type values stored together for better LZ4 compression
  • Efficient for aggregations (SUM, AVG, COUNT)
  • Better compression for numeric sequences

Storage format:

col:{table}:{column}:{batch_id} → bincode Vec<Value> (1024 values)
Row tuple: Value::ColumnarRef (placeholder)

When to use:

  • Time-series data (timestamps, metrics)
  • Aggregation-heavy analytics
  • Range scans over numeric columns
  • Data warehouse fact tables

Performance benefits:

| Operation | Row-oriented | Columnar | Improvement |
|-----------|--------------|----------|-------------|
| SELECT SUM(amount) | Scan all columns | Scan one column | 3-5x faster |
| SELECT AVG(price) | Deserialize tuples | Read batch | 2-4x faster |
| Range scan | Random I/O | Sequential I/O | 2-3x faster |

Compression benefits:

| Data Pattern | Row-oriented LZ4 | Columnar LZ4 | Improvement |
|--------------|------------------|--------------|-------------|
| Sequential INT8 | 3.96x | 5-8x | +30-50% |
| Float sequences | 3.70x | 5-7x | +35-50% |
| Timestamps | 3.96x | 6-10x | +50-60% |
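The batch layout can be sketched in Python. This is an illustrative simulation (a dict standing in for RocksDB; the function names are hypothetical), showing why an aggregate only has to touch one column's batches:

```python
BATCH_SIZE = 1024  # values per col:{table}:{column}:{batch_id} entry

def to_batches(values):
    """Group one column's values into fixed-size batches, as columnar
    storage does, so a scan reads only that column's keys."""
    return [values[i:i + BATCH_SIZE] for i in range(0, len(values), BATCH_SIZE)]

# Simulated column store: column name -> list of batches
temperatures = [20.0 + (i % 10) * 0.5 for i in range(3000)]
store = {"temperature": to_batches(temperatures)}

def column_sum(store, column):
    # Sequential reads over one column; other columns are never deserialized
    return sum(sum(batch) for batch in store[column])

total = column_sum(store, "temperature")
print(len(store["temperature"]), total)
```

3000 values land in three batches (1024 + 1024 + 952), and summing them never deserializes any other column; in the real engine the same-typed, often similar-valued batch is also what LZ4 compresses so well.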

Example:

CREATE TABLE metrics (
    id INT PRIMARY KEY,
    sensor_id INT,
    timestamp INT8 STORAGE COLUMNAR,      -- Sequential timestamps
    temperature FLOAT8 STORAGE COLUMNAR,  -- Sensor readings
    pressure FLOAT8 STORAGE COLUMNAR,     -- Similar numeric pattern
    status TEXT STORAGE DICTIONARY        -- Mix with dictionary
);

-- Analytics query benefits from columnar storage
SELECT
    sensor_id,
    AVG(temperature) as avg_temp,
    MAX(pressure) as max_pressure
FROM metrics
WHERE timestamp > 1704067200000
GROUP BY sensor_id;
-- Only reads the timestamp, temperature, pressure, and sensor_id columns
-- Skips id, status, and all other column data


Combining Storage Modes

You can use different storage modes for different columns in the same table:

CREATE TABLE e_commerce_events (
    -- Default storage for keys
    id INT PRIMARY KEY,
    user_id INT,

    -- Dictionary for low-cardinality
    event_type TEXT STORAGE DICTIONARY,     -- click, view, purchase, etc.
    country TEXT STORAGE DICTIONARY,        -- ~200 countries
    device_type TEXT STORAGE DICTIONARY,    -- mobile, desktop, tablet

    -- Content-addressed for large duplicates
    user_agent TEXT STORAGE CONTENT_ADDRESSED,  -- Many users same browser
    referrer_url TEXT STORAGE CONTENT_ADDRESSED, -- Common referrers

    -- Columnar for analytics
    timestamp INT8 STORAGE COLUMNAR,        -- Time-series analysis
    session_duration INT4 STORAGE COLUMNAR, -- Aggregations
    page_views INT4 STORAGE COLUMNAR        -- Aggregations
);

Online Migration

When you change a column's storage mode, HeliosDB-Lite performs an online migration:

  1. Scan existing rows one by one
  2. Decode from old format (if any)
  3. Encode to new format
  4. Write back to storage
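The four steps above amount to a scan-and-rewrite loop. A minimal Python sketch (the function and encoder names are hypothetical, and a dict stands in for the row store):

```python
def migrate_column(rows, old_decode, new_encode):
    """Online migration: scan each row, decode the column from the old
    format, re-encode in the new format, and write it back in place."""
    migrated = 0
    for row_id, stored in rows.items():
        plain = old_decode(stored)         # step 2: decode old format
        rows[row_id] = new_encode(plain)   # steps 3-4: re-encode, write back
        migrated += 1
    return migrated

# Example: DEFAULT (inline strings) -> DICTIONARY (integer IDs)
dictionary = {}
def to_dict_id(value):
    return dictionary.setdefault(value, len(dictionary))

table = {1: "pending", 2: "shipped", 3: "pending"}
count = migrate_column(table, old_decode=lambda v: v, new_encode=to_dict_id)
print(count, table, dictionary)
```

Because each row is rewritten independently, the table stays readable throughout; the migration rates in the next section reflect this row-at-a-time cost.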

Migration Performance

| Source Mode | Target Mode | Rows/sec |
|-------------|-------------|----------|
| Default | Dictionary | 250K-300K |
| Default | CAS | 200K-250K |
| Default | Columnar | 150K-200K |
| Dictionary | Default | 300K-350K |
| CAS | Default | 200K-250K |

Example Migration

-- Check current storage (if supported)
SELECT column_name, storage_mode
FROM information_schema.columns
WHERE table_name = 'events';

-- Migrate status column (1M rows takes ~4 seconds)
ALTER TABLE events ALTER COLUMN status SET STORAGE DICTIONARY;

-- Migrate payload column (1M rows with 10KB payloads takes ~5 seconds)
ALTER TABLE events ALTER COLUMN payload SET STORAGE CONTENT_ADDRESSED;

Best Practices

1. Choose Based on Data Characteristics

| Data Pattern | Recommended Mode |
|--------------|------------------|
| <1000 unique values | DICTIONARY |
| >50% duplicate large values (>1KB) | CONTENT_ADDRESSED |
| Time-series or aggregation targets | COLUMNAR |
| Everything else | DEFAULT |

2. Don't Over-Optimize

  • Default is fine for most columns
  • Only use specialized modes when there's clear benefit
  • Profile your actual query patterns

3. Consider Query Patterns

| Query Pattern | Best Mode |
|---------------|-----------|
| Point lookups by PK | DEFAULT |
| Filter by category | DICTIONARY |
| Large document values with duplicates | CONTENT_ADDRESSED |
| Aggregations (SUM, AVG, COUNT) | COLUMNAR |
| Range scans | COLUMNAR |

4. Monitor Storage Usage

-- Check table sizes (when available)
SELECT
    table_name,
    pg_total_relation_size(table_name) as total_size
FROM information_schema.tables
WHERE table_schema = 'public';

Storage Mode Reference

SQL Syntax

CREATE TABLE:

CREATE TABLE table_name (
    column_name TYPE STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR},
    ...
);

ALTER TABLE:

ALTER TABLE table_name
ALTER COLUMN column_name
SET STORAGE {DEFAULT | DICTIONARY | CONTENT_ADDRESSED | COLUMNAR};

Internal Value Types

pub enum Value {
    // ... standard types ...

    /// Dictionary-encoded string reference
    DictRef { dict_id: u32 },

    /// Content-addressed hash reference
    CasRef { hash: [u8; 32] },

    /// Placeholder for columnar storage
    ColumnarRef,
}

Key Formats

| Mode | Key Pattern | Value |
|------|-------------|-------|
| Dictionary | dict:{table}:{column} | Serialized dictionary |
| CAS | cas:{blake3_hash} | Original bytes |
| Columnar | col:{table}:{column}:{batch_id} | Batch of 1024 values |
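The key patterns above can be built mechanically. A small sketch (the helper names are hypothetical, and hex-encoding the hash is an assumption; the engine may well embed the raw 32 bytes):

```python
def dict_key(table: str, column: str) -> bytes:
    # dict:{table}:{column}
    return f"dict:{table}:{column}".encode()

def cas_key(content_hash: bytes) -> bytes:
    # cas:{blake3_hash}; hex encoding assumed here for readability
    return b"cas:" + content_hash.hex().encode()

def columnar_key(table: str, column: str, batch_id: int) -> bytes:
    # col:{table}:{column}:{batch_id}
    return f"col:{table}:{column}:{batch_id}".encode()

print(dict_key("orders", "status"))
print(columnar_key("metrics", "temperature", 7))
```

Because batch IDs share a col:{table}:{column}: prefix, all of one column's batches sort adjacently in RocksDB, which is what makes columnar scans sequential rather than random.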

Troubleshooting

Dictionary Too Large

If you see warnings about dictionary size exceeding 64K entries:

-- Check cardinality before using dictionary
SELECT COUNT(DISTINCT status) FROM orders;

-- If >64K unique values, use DEFAULT instead
ALTER TABLE orders ALTER COLUMN status SET STORAGE DEFAULT;

CAS Not Deduplicating

Content-addressed storage only deduplicates values >= 1KB:

-- Check average value size
SELECT AVG(LENGTH(content)) FROM documents;

-- If <1KB, CAS won't help - use DEFAULT
ALTER TABLE documents ALTER COLUMN content SET STORAGE DEFAULT;

Slow Aggregations

If aggregations are slow on DEFAULT storage, consider COLUMNAR:

-- Profile query
EXPLAIN ANALYZE SELECT SUM(amount) FROM transactions;

-- If scanning many columns, switch to columnar
ALTER TABLE transactions ALTER COLUMN amount SET STORAGE COLUMNAR;

Performance Benchmarks

10GB Dataset Results

| Storage Mode | Insert Rate | Storage Size | Compression |
|--------------|-------------|--------------|-------------|
| Default + LZ4 | 700K rows/sec | 2.7 GB | 3.96x |
| Dictionary + LZ4 | 650K rows/sec | 1.2 GB | 8.9x |
| CAS (70% dupes) | 550K rows/sec | 0.8 GB | 13.4x |
| Columnar + LZ4 | 600K rows/sec | 2.1 GB | 5.1x |

Query Performance

| Query Type | Default | Columnar | Improvement |
|------------|---------|----------|-------------|
| Point lookup | 0.1ms | 0.3ms | 3x slower |
| Full scan | 850ms | 320ms | 2.6x faster |
| SUM aggregation | 450ms | 120ms | 3.7x faster |
| AVG aggregation | 480ms | 130ms | 3.7x faster |


Last Updated: 2026-01-17
Version: HeliosDB-Lite v3.6+