Per-Column Storage Modes

HeliosDB provides four storage modes that can be assigned independently to each column in a table. This per-column granularity — rather than per-table — lets you optimize the physical layout for the exact access pattern each column experiences. Mix row-oriented, columnar, dictionary-encoded, and content-addressed storage within a single table.

  • Default (row-oriented) — The standard row store, optimal for OLTP point lookups and transactional updates. Each row is stored as a contiguous tuple. This is the default when no STORAGE clause is specified.
  • Columnar (batched analytics) — Values are stored in 1,024-row batches organized by column, enabling sequential reads and SIMD-accelerated aggregation. Ideal for columns frequently used in GROUP BY, SUM, AVG, and full-table scans.
  • Dictionary (low-cardinality) — Values are replaced by compact integer codes that reference a shared dictionary. Best for columns with few distinct values such as status, category, or country — delivers both compression and fast equality predicates.
  • ContentAddressed (deduplication) — Values are stored by content hash and deduplicated automatically. Ideal for large text payloads, JSON documents, or any column where many rows share identical values.

Use ALTER TABLE ... ALTER COLUMN ... SET STORAGE to migrate columns between storage modes at any time without rewriting the entire table. The EXPLAIN (STORAGE) command verifies the current layout.

SQL
-- Per-column storage modes in CREATE TABLE
CREATE TABLE events (
    id         INT PRIMARY KEY,              -- Default (row store)
    user_id    INT,                          -- Default (row store)
    category   TEXT STORAGE DICTIONARY,      -- Dictionary encoding
    amount     FLOAT8 STORAGE COLUMNAR,      -- Columnar batches
    timestamp  INT8 STORAGE COLUMNAR,        -- Columnar batches
    payload    TEXT STORAGE CONTENT_ADDRESSED -- Deduplication
);

-- Migrate columns between storage modes
ALTER TABLE events
  ALTER COLUMN user_id SET STORAGE COLUMNAR;

-- Verify storage layout
EXPLAIN (STORAGE) SELECT * FROM events;

EXPLAIN Workload Classification

Every EXPLAIN ANALYZE output now includes a Workload Classification section at the end of the plan. The ML-driven WorkloadClassifier scores every query on the OLTP/OLAP/HTAP spectrum based on access patterns, aggregation ratio, selectivity, join complexity, and expected latency. This classification drives automatic store routing — the query engine selects row store or column store without manual hints.

  • Type — The classified workload type: OLTP, OLAP, or HTAP (mixed). Determined by the combined OLTP and OLAP scores.
  • Confidence — How certain the classifier is about the workload type, expressed as a percentage. Higher confidence means the query clearly fits one pattern.
  • OLTP Score — A 0.0–1.0 score reflecting transactional characteristics: point lookups, low selectivity, index usage, and small result sets.
  • OLAP Score — A 0.0–1.0 score reflecting analytical characteristics: full-table scans, aggregations, GROUP BY, window functions, and large result sets.
  • Recommended Store — The storage format the classifier recommends for this query: Row Store, Column Store, or Mixed. The StoreRouter uses this recommendation to direct the scan operator.
EXPLAIN Output
helios> EXPLAIN ANALYZE SELECT * FROM orders WHERE id = 42;

EXPLAIN ANALYZE
...
Workload Classification
  Type             : OLTP
  Confidence       : 90%
  OLTP Score       : 0.85
  OLAP Score       : 0.12
  Recommended Store: Row Store

helios> EXPLAIN ANALYZE SELECT customer, SUM(amount)
        FROM orders GROUP BY customer;

...
Workload Classification
  Type             : OLAP
  Confidence       : 90%
  OLTP Score       : 0.10
  OLAP Score       : 0.80
  Recommended Store: Column Store

Columnar Scan Detection

When all columns in a scanned table use STORAGE COLUMNAR, the EXPLAIN output shows “Columnar Scan” instead of the standard “Seq Scan”. This confirms that the query engine is using the optimized batch columnar scan operator — the ColumnarScanOperator — which reads column batches sequentially for maximum analytics throughput.

  • Automatic detection — No hints or configuration required. The planner inspects column storage modes and selects the columnar scan operator when all referenced columns are columnar.
  • Batch processing — The columnar scan reads data in 1,024-row batches, enabling SIMD-accelerated predicate evaluation and vectorized aggregation.
  • I/O efficiency — Sequential column reads minimize random I/O. Only the columns referenced in the query are read from storage, reducing bandwidth by up to 90% for wide tables.
  • Mixed-mode fallback — If any referenced column uses a non-columnar storage mode, the standard scan operator is used with transparent ColumnarRef resolution. The query always returns correct results regardless of storage layout.
SQL
-- All columns use STORAGE COLUMNAR
CREATE TABLE metrics (
    ts    INT8 STORAGE COLUMNAR,
    value FLOAT8 STORAGE COLUMNAR
);

helios> EXPLAIN ANALYZE SELECT * FROM metrics;
-- Shows: → Columnar Scan metrics
-- Instead of: → Scan metrics

-- The ColumnarScanOperator reads
-- column batches sequentially for
-- maximum analytics throughput.
--
-- Aggregations on columnar tables
-- are 10-50x faster than row scans:

helios> EXPLAIN ANALYZE
        SELECT AVG(value) FROM metrics
        WHERE ts > 1704067200;
-- → Columnar Scan metrics
--      Filter: ts > 1704067200
--      Rows: 50000, Time: 1.2ms

EXPLAIN (STORAGE) Format

The EXPLAIN (STORAGE) option shows detailed per-column storage information for every table referenced in the query. This diagnostic view reveals the physical storage layout, including storage mode, bloom filter status, zone map coverage, compression algorithms, and columnar batch statistics. Use it to verify that your storage configuration matches your workload expectations.

  • Per-column storage mode — Shows whether each column is stored as Default, Columnar, Dictionary, or ContentAddressed.
  • Bloom filter status — Indicates whether a bloom filter is active on the column and its false positive rate. Bloom filters accelerate equality predicates by skipping irrelevant pages.
  • Zone map coverage — Shows min/max metadata per column segment. Zone maps enable range predicate pruning without reading actual data.
  • Compression algorithms — Displays the compression algorithm in use for each column (ZSTD, LZ4, delta, FSST, etc.) and the achieved compression ratio.
  • Columnar batch statistics — For columnar columns, shows the number of batches, rows per batch, and total batch size in bytes.
SQL
EXPLAIN (STORAGE) SELECT * FROM events;

-- Shows per-column storage mode breakdown
-- Bloom filter status per column
-- Zone map coverage
-- Compression algorithms in use
-- Columnar batch statistics

-- Example output:
Storage Layout: events
  id        : Default    | bloom: off  | zone_map: yes
  user_id   : Columnar   | bloom: on   | zone_map: yes
  category  : Dictionary | bloom: off  | zone_map: no
  amount    : Columnar   | bloom: off  | zone_map: yes
  timestamp : Columnar   | bloom: off  | zone_map: yes
  payload   : ContentAddr| bloom: off  | zone_map: no

Compression:
  amount    : delta + bit-packing  (12:1)
  timestamp : delta-of-delta       (20:1)
  category  : dictionary           (32:1)

Columnar Batches:
  user_id   : 5 batches, 1024 rows/batch
  amount    : 5 batches, 1024 rows/batch
  timestamp : 5 batches, 1024 rows/batch

HTAP Layout Statistics

Two system views provide real-time HTAP monitoring. The pg_htap_layout_stats view shows per-table storage layout information, giving you a high-level overview of how data is distributed between the row store and column store across all tables in the database.

Column Type Description
table_name TEXT Table name
row_store_size INT8 Estimated row store size in bytes
column_store_size INT8 Estimated column store size in bytes
row_store_rows INT8 Number of rows in row store
column_store_rows INT8 Number of rows in columnar batches
sync_status TEXT row_only, columnar_only, dual_format, or mixed
implemented BOOLEAN Whether HTAP is active
SQL
SELECT * FROM pg_htap_layout_stats;

-- Example output:
--  table_name | row_store_size | column_store_size
-- -----------+----------------+------------------
--  orders     |          64000 |             48000
--  users      |          12800 |                 0
--
--  row_store_rows | column_store_rows | sync_status
-- ---------------+-------------------+------------
--           1000 |              1000 | dual_format
--            200 |                 0 | row_only
--
--  implemented
-- -------------
--  true
--  true

-- Filter for dual-format tables only
SELECT table_name,
       row_store_rows,
       column_store_rows,
       sync_status
FROM pg_htap_layout_stats
WHERE sync_status = 'dual_format';

-- Monitor storage distribution
SELECT table_name,
       row_store_size AS row_bytes,
       column_store_size AS col_bytes,
       ROUND(column_store_size::float
         / (row_store_size + column_store_size)
         * 100, 1) AS columnar_pct
FROM pg_htap_layout_stats
WHERE implemented = true;

Workload Patterns View

The helios_workload_patterns system view exposes per-column access patterns and storage recommendations based on observed query behavior. This view is populated by the ML-driven WorkloadClassifier and updated continuously as queries execute. Use it to understand how each column is accessed and whether the current storage mode is optimal.

Column Type Description
table_name TEXT Table name
column_name TEXT Column name
read_frequency FLOAT8 Estimated read frequency (0.0–1.0)
write_frequency FLOAT8 Estimated write frequency (0.0–1.0)
delete_frequency FLOAT8 Delete frequency
equality_predicates INT8 Count of equality predicate usage
range_predicates INT8 Count of range predicate usage
in_list_predicates INT8 Count of IN list usage
join_count INT8 Count of join usage
data_temperature TEXT hot, warm, or cold
cardinality INT8 Estimated distinct values
avg_selectivity FLOAT8 Average filter selectivity
filter_recommendation TEXT Recommended filter type
SQL
SELECT table_name, column_name,
       data_temperature,
       filter_recommendation
FROM helios_workload_patterns
WHERE table_name = 'orders';

-- Example output:
--  table_name | column_name
-- -----------+------------
--  orders     | id
--  orders     | customer
--  orders     | amount
--  orders     | created_at
--
--  data_temperature | filter_recommendation
-- -----------------+----------------------
--  hot              | bloom_filter
--  hot              | dictionary_index
--  warm             | zone_map
--  warm             | zone_map

-- Find hot columns that might benefit
-- from storage mode changes
SELECT table_name, column_name,
       read_frequency,
       equality_predicates,
       range_predicates
FROM helios_workload_patterns
WHERE data_temperature = 'hot'
ORDER BY read_frequency DESC;

Resource Isolation

The HTAP resource manager maintains separate CPU, memory, and I/O budgets for OLTP and OLAP workloads. This isolation prevents analytical queries — which may scan millions of rows — from consuming resources needed by latency-sensitive transactional operations. The default allocation is 60% OLTP / 40% OLAP, fully configurable via heliosdb.toml.

  • CPU isolation — OLTP and OLAP workloads are scheduled on separate thread pools with configurable CPU ratios. Analytical queries cannot starve transactional operations of compute time.
  • Memory budgets — Each workload type has a dedicated memory allocation. The buffer pool, sort buffers, and hash tables for OLAP queries draw from the OLAP budget, leaving the OLTP budget untouched.
  • Connection limits — Separate connection pools for OLTP (default: 1,000) and OLAP (default: 16) prevent connection exhaustion. A long-running analytical query cannot block incoming transactional connections.
  • I/O prioritization — OLTP reads and writes are prioritized at the I/O scheduler level. OLAP full-table scans are submitted at lower priority to maintain transactional responsiveness.
  • Automatic classification — The WorkloadClassifier assigns each query to the OLTP or OLAP resource pool automatically. No application-level changes required — just configure the ratios and let the system manage isolation.
heliosdb.toml
# heliosdb.toml - Resource isolation settings
[htap]
oltp_cpu_ratio = 0.6      # 60% CPU for OLTP
olap_cpu_ratio = 0.4      # 40% CPU for OLAP
oltp_memory_ratio = 0.6   # 60% memory for OLTP
olap_memory_ratio = 0.4   # 40% memory for OLAP
oltp_max_connections = 1000
olap_max_connections = 16

# I/O priority settings
[htap.io]
oltp_io_priority = "high"
olap_io_priority = "normal"
olap_scan_throttle = false

# Workload classifier tuning
[htap.classifier]
enabled = true
oltp_threshold = 0.6
olap_threshold = 0.6
# Queries scoring below both thresholds
# are classified as HTAP (mixed) and
# can draw from either resource pool.

# Example resource allocation on a
# 16-core, 64GB server:
#   OLTP: 10 cores, 38GB RAM, 1000 conns
#   OLAP:  6 cores, 26GB RAM,   16 conns

Available in HeliosDB Lite and Full

HeliosCore powers the storage layer in both HeliosDB Lite and HeliosDB Full editions. HTAP dual-format storage, Direct I/O, adaptive compression, and ML-powered optimization — start building with the next-generation storage engine today.