Per-column storage modes, workload classification, system views, and monitoring.
Complete SQL reference for HeliosDB's hybrid transactional/analytical processing.
HeliosDB provides four storage modes that can be assigned independently to each column in a table. This per-column granularity — rather than per-table — lets you optimize the physical layout for the exact access pattern each column experiences. Mix row-oriented, columnar, dictionary-encoded, and content-addressed storage within a single table.
Use ALTER TABLE ... ALTER COLUMN ... SET STORAGE to migrate columns between storage modes at any time without rewriting the entire table. The EXPLAIN (STORAGE) command verifies the current layout.
-- Per-column storage modes in CREATE TABLE
CREATE TABLE events (
id INT PRIMARY KEY, -- Default (row store)
user_id INT, -- Default (row store)
category TEXT STORAGE DICTIONARY, -- Dictionary encoding
amount FLOAT8 STORAGE COLUMNAR, -- Columnar batches
timestamp INT8 STORAGE COLUMNAR, -- Columnar batches
payload TEXT STORAGE CONTENT_ADDRESSED -- Deduplication
);
-- Migrate columns between storage modes
ALTER TABLE events
ALTER COLUMN user_id SET STORAGE COLUMNAR;
-- Verify storage layout
EXPLAIN (STORAGE) SELECT * FROM events;
Every EXPLAIN ANALYZE output now includes a Workload Classification section at the end of the plan. The ML-driven WorkloadClassifier scores every query on the OLTP/OLAP/HTAP spectrum based on access patterns, aggregation ratio, selectivity, join complexity, and expected latency. This classification drives automatic store routing — the query engine selects row store or column store without manual hints.
helios> EXPLAIN ANALYZE SELECT * FROM orders WHERE id = 42;
EXPLAIN ANALYZE
...
Workload Classification
Type : OLTP
Confidence : 90%
OLTP Score : 0.85
OLAP Score : 0.12
Recommended Store: Row Store
helios> EXPLAIN ANALYZE SELECT customer, SUM(amount)
FROM orders GROUP BY customer;
...
Workload Classification
Type : OLAP
Confidence : 90%
OLTP Score : 0.10
OLAP Score : 0.80
Recommended Store: Column Store
When all columns in a scanned table use STORAGE COLUMNAR, the EXPLAIN output shows “Columnar Scan” instead of the standard “Seq Scan”. This confirms that the query engine is using the optimized batch columnar scan operator — the ColumnarScanOperator — which reads column batches sequentially for maximum analytics throughput.
-- All columns use STORAGE COLUMNAR
CREATE TABLE metrics (
ts INT8 STORAGE COLUMNAR,
value FLOAT8 STORAGE COLUMNAR
);
helios> EXPLAIN ANALYZE SELECT * FROM metrics;
-- Shows: → Columnar Scan metrics
-- Instead of: → Scan metrics
-- The ColumnarScanOperator reads
-- column batches sequentially for
-- maximum analytics throughput.
--
-- Aggregations on columnar tables
-- are 10-50x faster than row scans:
helios> EXPLAIN ANALYZE
SELECT AVG(value) FROM metrics
WHERE ts > 1704067200;
-- → Columnar Scan metrics
-- Filter: ts > 1704067200
-- Rows: 50000, Time: 1.2ms
The EXPLAIN (STORAGE) option shows detailed per-column storage information for every table referenced in the query. This diagnostic view reveals the physical storage layout, including storage mode, bloom filter status, zone map coverage, compression algorithms, and columnar batch statistics. Use it to verify that your storage configuration matches your workload expectations.
EXPLAIN (STORAGE) SELECT * FROM events;
-- Shows per-column storage mode breakdown
-- Bloom filter status per column
-- Zone map coverage
-- Compression algorithms in use
-- Columnar batch statistics
-- Example output:
Storage Layout: events
id : Default | bloom: off | zone_map: yes
user_id : Columnar | bloom: on | zone_map: yes
category : Dictionary | bloom: off | zone_map: no
amount : Columnar | bloom: off | zone_map: yes
timestamp : Columnar | bloom: off | zone_map: yes
payload : ContentAddr| bloom: off | zone_map: no
Compression:
amount : delta + bit-packing (12:1)
timestamp : delta-of-delta (20:1)
category : dictionary (32:1)
Columnar Batches:
user_id : 5 batches, 1024 rows/batch
amount : 5 batches, 1024 rows/batch
timestamp : 5 batches, 1024 rows/batch
Two system views provide real-time HTAP monitoring. The pg_htap_layout_stats view shows per-table storage layout information, giving you a high-level overview of how data is distributed between the row store and column store across all tables in the database.
| Column | Type | Description |
|---|---|---|
table_name |
TEXT | Table name |
row_store_size |
INT8 | Estimated row store size in bytes |
column_store_size |
INT8 | Estimated column store size in bytes |
row_store_rows |
INT8 | Number of rows in row store |
column_store_rows |
INT8 | Number of rows in columnar batches |
sync_status |
TEXT | row_only, columnar_only, dual_format, or mixed |
implemented |
BOOLEAN | Whether HTAP is active |
SELECT * FROM pg_htap_layout_stats;
-- Example output:
-- table_name | row_store_size | column_store_size
-- -----------+----------------+------------------
-- orders | 64000 | 48000
-- users | 12800 | 0
--
-- row_store_rows | column_store_rows | sync_status
-- ---------------+-------------------+------------
-- 1000 | 1000 | dual_format
-- 200 | 0 | row_only
--
-- implemented
-- -------------
-- true
-- true
-- Filter for dual-format tables only
SELECT table_name,
row_store_rows,
column_store_rows,
sync_status
FROM pg_htap_layout_stats
WHERE sync_status = 'dual_format';
-- Monitor storage distribution
SELECT table_name,
row_store_size AS row_bytes,
column_store_size AS col_bytes,
ROUND(column_store_size::float
/ (row_store_size + column_store_size)
* 100, 1) AS columnar_pct
FROM pg_htap_layout_stats
WHERE implemented = true;
The helios_workload_patterns system view exposes per-column access patterns and storage recommendations based on observed query behavior. This view is populated by the ML-driven WorkloadClassifier and updated continuously as queries execute. Use it to understand how each column is accessed and whether the current storage mode is optimal.
| Column | Type | Description |
|---|---|---|
table_name |
TEXT | Table name |
column_name |
TEXT | Column name |
read_frequency |
FLOAT8 | Estimated read frequency (0.0–1.0) |
write_frequency |
FLOAT8 | Estimated write frequency (0.0–1.0) |
delete_frequency |
FLOAT8 | Delete frequency |
equality_predicates |
INT8 | Count of equality predicate usage |
range_predicates |
INT8 | Count of range predicate usage |
in_list_predicates |
INT8 | Count of IN list usage |
join_count |
INT8 | Count of join usage |
data_temperature |
TEXT | hot, warm, or cold |
cardinality |
INT8 | Estimated distinct values |
avg_selectivity |
FLOAT8 | Average filter selectivity |
filter_recommendation |
TEXT | Recommended filter type |
SELECT table_name, column_name,
data_temperature,
filter_recommendation
FROM helios_workload_patterns
WHERE table_name = 'orders';
-- Example output:
-- table_name | column_name
-- -----------+------------
-- orders | id
-- orders | customer
-- orders | amount
-- orders | created_at
--
-- data_temperature | filter_recommendation
-- -----------------+----------------------
-- hot | bloom_filter
-- hot | dictionary_index
-- warm | zone_map
-- warm | zone_map
-- Find hot columns that might benefit
-- from storage mode changes
SELECT table_name, column_name,
read_frequency,
equality_predicates,
range_predicates
FROM helios_workload_patterns
WHERE data_temperature = 'hot'
ORDER BY read_frequency DESC;
The HTAP resource manager maintains separate CPU, memory, and I/O budgets for OLTP and OLAP workloads. This isolation prevents analytical queries — which may scan millions of rows — from consuming resources needed by latency-sensitive transactional operations. The default allocation is 60% OLTP / 40% OLAP, fully configurable via heliosdb.toml.
# heliosdb.toml - Resource isolation settings
[htap]
oltp_cpu_ratio = 0.6 # 60% CPU for OLTP
olap_cpu_ratio = 0.4 # 40% CPU for OLAP
oltp_memory_ratio = 0.6 # 60% memory for OLTP
olap_memory_ratio = 0.4 # 40% memory for OLAP
oltp_max_connections = 1000
olap_max_connections = 16
# I/O priority settings
[htap.io]
oltp_io_priority = "high"
olap_io_priority = "normal"
olap_scan_throttle = false
# Workload classifier tuning
[htap.classifier]
enabled = true
oltp_threshold = 0.6
olap_threshold = 0.6
# Queries scoring below both thresholds
# are classified as HTAP (mixed) and
# can draw from either resource pool.
# Example resource allocation on a
# 16-core, 64GB server:
# OLTP: 10 cores, 38GB RAM, 1000 conns
# OLAP: 6 cores, 26GB RAM, 16 conns
HeliosCore powers the storage layer in both HeliosDB Lite and HeliosDB Full editions. HTAP dual-format storage, Direct I/O, adaptive compression, and ML-powered optimization — start building with the next-generation storage engine today.