Time-Travel Queries: Business Use Case for HeliosDB-Lite

Document ID: 02_TIME_TRAVEL_QUERIES.md
Version: 1.0
Created: 2025-11-30
Category: Query Optimization & Compliance
HeliosDB-Lite Version: 2.5.0+


Executive Summary

HeliosDB-Lite's Time-Travel Queries provide point-in-time database access at less than 2x the latency of equivalent current-data queries, giving embedded and edge deployments regulatory compliance, audit trail, and incident-investigation capabilities. The feature supports three query modes (AS OF TIMESTAMP, AS OF TRANSACTION, AS OF SCN) with automatic snapshot management, configurable retention policies (default 30 days), and O(1) snapshot lookup. This positions HeliosDB-Lite as the only embedded database offering enterprise-grade temporal queries for financial services, healthcare, and other compliance-sensitive industries without external infrastructure or cloud connectivity.


Problem Being Solved

Core Problem Statement

Organizations in regulated industries must prove data state at specific points in time for audits, compliance investigations, and incident analysis, but traditional embedded databases provide no mechanism to query historical data states without complex manual versioning schemes. Current workarounds involve full database backups, application-level change tracking, or expensive cloud database dependencies that violate edge computing requirements.

Root Cause Analysis

| Factor | Impact | Current Workaround | Limitation |
|--------|--------|--------------------|------------|
| No historical data access | Cannot prove data state during audits, risking regulatory penalties ($10K-$1M+) | Full database backups at intervals | Restore requires downtime; coarse granularity (daily/hourly); cannot query specific transaction states |
| Manual change tracking | Development overhead of 40-80 hours per table for versioning logic | Application-level audit tables with triggers/ORM hooks | Doubles storage requirements; increases query complexity; inconsistent across applications |
| Cloud database dependency | Network latency (50-200ms), connectivity requirements, data sovereignty violations | PostgreSQL/Oracle with temporal features | Requires constant connectivity; eliminates edge deployment benefits; adds $500-$5K/month infrastructure costs |
| Incident reconstruction | Average 4-8 hours to diagnose data corruption or application bugs | Analyze application logs, restore backups to staging | Time-consuming, error-prone; may miss critical state transitions |

Business Impact Quantification

| Metric | Without HeliosDB-Lite | With HeliosDB-Lite | Improvement |
|--------|-----------------------|--------------------|-------------|
| Audit compliance time | 40-80 hours per audit cycle | 2-4 hours | 20x faster |
| Development effort | 40-80 hours per versioned table | 0 hours (built-in) | 100% reduction |
| Storage overhead | 2x (manual audit tables) | 1.3x (efficient versioning) | 35% less storage |
| Query latency | 50-200ms (cloud DB) / N/A (no solution) | <10ms (local time-travel) | 20x faster / newly enabled |
| Infrastructure costs | $500-$5K/month (cloud DB) | $0 (embedded) | 100% reduction |
| Incident diagnosis time | 4-8 hours | 30-60 minutes | 8x faster |

Who Suffers Most

  1. Compliance Officers: Must manually reconstruct data states from backups and logs during regulatory audits, risking penalties for incomplete records or extended investigation timelines.

  2. DevOps Engineers: Spend hours diagnosing production incidents by correlating logs with database backups, unable to query exact state at incident timestamp without full restore operations.

  3. Financial Auditors: Cannot efficiently validate account balances, transaction histories, or pricing data as it existed on specific dates without complex ETL processes from backup files.

  4. Edge Application Developers: Forced to choose between building custom versioning (40-80 hours/table) or deploying cloud databases (eliminating offline capability and adding $500-$5K/month costs).


Why Competitors Cannot Solve This

Technical Barriers

| Competitor Category | Limitation | Root Cause | Time to Match |
|---------------------|------------|------------|---------------|
| SQLite / DuckDB | No temporal query support; historical access requires full backup restoration | Architecture designed for simplicity; no MVCC versioning or snapshot isolation | 12-18 months |
| Oracle Flashback / Postgres temporal | Requires heavy server infrastructure; cannot run embedded on edge devices (100MB+ memory footprint) | Client-server, shared-nothing design incompatible with in-process embedding | 18-24 months |
| Snowflake / BigQuery Time Travel | Cloud-only; requires constant connectivity; violates data sovereignty for edge deployments | Distributed cloud architecture; no offline operation mode | Not applicable (different deployment model) |
| Traditional embedded DBs | Manual versioning via triggers/audit tables doubles storage, requires 40-80 dev hours per table | No snapshot management infrastructure; users must implement it in the application layer | 6-12 months |

Architecture Requirements

To match HeliosDB-Lite's Time-Travel Queries, competitors would need:

  1. Snapshot-based MVCC with version key encoding: Implement efficient versioned tuple storage (v:{table}:{row_id}:{timestamp}) that co-locates historical and current data for <2x query overhead. This requires deep integration with the storage engine's key-value layer, RocksDB prefix iteration optimizations, and in-memory snapshot metadata caching.

  2. Three-mode temporal resolution (TIMESTAMP/TRANSACTION/SCN): Build parsing, resolution, and mapping infrastructure for SQL:2011 temporal syntax plus Oracle/enterprise compatibility. Requires extending the SQL parser, logical planner, and executor to handle AS OF clauses transparently across all query operators.

  3. Automatic snapshot management with O(1) lookup: Create in-memory snapshot metadata indexes mapping timestamps, transaction IDs, and SCNs to snapshot identifiers with microsecond-precision resolution. Must persist to durable storage, recover on restart, and handle concurrent snapshot registration during high-throughput inserts.

  4. Configurable garbage collection with retention policies: Implement age-based and count-based GC that safely removes old snapshots without corrupting in-flight queries. Requires reference counting, GC eligibility tracking, and atomic metadata/version cleanup coordination.
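The version-key scheme in requirement 1 can be made concrete with a toy model. The following Python sketch is illustrative only (the shipped engine implements this in Rust on RocksDB; `VersionStore` and its methods are hypothetical names): zero-padded `v:{table}:{row_id}:{timestamp}` keys let an AS OF read become a prefix scan plus a binary search for the newest version at or before the target time.

```python
import bisect

def version_key(table: str, row_id: int, ts_ms: int) -> str:
    """Encode v:{table}:{row_id}:{timestamp}; zero-padding makes
    lexicographic key order match numeric timestamp order."""
    return f"v:{table}:{row_id}:{ts_ms:020d}"

class VersionStore:
    """Toy stand-in for the RocksDB key-value layer."""
    def __init__(self):
        self._kv = {}

    def put(self, table, row_id, ts_ms, value):
        self._kv[version_key(table, row_id, ts_ms)] = value

    def read_as_of(self, table, row_id, ts_ms):
        """Prefix-scan the row's versions; return the newest one <= ts_ms."""
        prefix = f"v:{table}:{row_id}:"
        versions = sorted(k for k in self._kv if k.startswith(prefix))
        idx = bisect.bisect_right(versions, version_key(table, row_id, ts_ms))
        return self._kv[versions[idx - 1]] if idx else None

store = VersionStore()
store.put("users", 1001, 1673778400000, {"name": ""})
store.put("users", 1001, 1673778600000, {"name": "A"})
assert store.read_as_of("users", 1001, 1673778500000) == {"name": ""}
assert store.read_as_of("users", 1001, 1673778300000) is None  # row did not exist yet
```

A real engine would iterate the RocksDB prefix directly rather than sorting keys in memory, which is what keeps historical reads under the 2x overhead target.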

Competitive Moat Analysis

Development Effort to Match:
├── Snapshot-based MVCC infrastructure: 8-12 weeks (storage engine modifications, version key encoding)
├── AS OF clause parsing & execution: 4-6 weeks (SQL parser, logical planner, executor integration)
├── Snapshot metadata management: 4-6 weeks (in-memory indexing, persistence, recovery)
├── Three-mode temporal resolution: 3-4 weeks (TIMESTAMP/TRANSACTION/SCN mapping)
├── Garbage collection system: 3-4 weeks (age/count-based GC, safety guarantees)
├── Performance optimization: 4-6 weeks (prefix scans, caching, <2x overhead target)
└── Total: 26-38 weeks (6-9 person-months)

Why They Won't:
├── SQLite team maintains stability focus, avoids complex features (25+ year track record)
├── DuckDB optimizes for analytics, not operational versioning (different use case priority)
└── Cloud databases (Snowflake, BigQuery) have no incentive to support offline/edge scenarios

HeliosDB-Lite Solution

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    SQL Query Layer (AS OF Parsing)               │
├─────────────────────────────────────────────────────────────────┤
│  AS OF TIMESTAMP  │  AS OF TRANSACTION  │  AS OF SCN  │ (none) │
│   '2025-01-15'    │        42           │    1000     │ current│
├─────────────────────────────────────────────────────────────────┤
│                    Snapshot Manager (Resolution)                 │
│  ┌────────────────────────────────────────────────────────┐     │
│  │ In-Memory Metadata Cache (O(1) Lookup)                 │     │
│  │  - Timestamp → Snapshot ID                             │     │
│  │  - Transaction ID → Snapshot ID                        │     │
│  │  - SCN → Snapshot ID                                   │     │
│  └────────────────────────────────────────────────────────┘     │
├─────────────────────────────────────────────────────────────────┤
│                    Query Executor (Version Scan)                 │
│  Snapshot ID: 1673778600000                                     │
├─────────────────────────────────────────────────────────────────┤
│                    Storage Engine (RocksDB)                      │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │ Current Data:  data:users:1001 → {name: "Alice"}        │   │
│  │ Versioned:     v:users:1001:1673778600000 → {name: "A"} │   │
│  │               v:users:1001:1673778400000 → {name: ""}   │   │
│  └─────────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────────┤
│                    Garbage Collection (Retention Policy)         │
│  Age-based (30 days) + Count-based (1000 snapshots) + Manual    │
└─────────────────────────────────────────────────────────────────┘
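The Snapshot Manager layer above is essentially three in-memory maps, one per temporal mode. A minimal Python sketch follows (the class and method names are assumptions for illustration, not the shipped API; real AS OF TIMESTAMP resolution would also fall back to the nearest snapshot at or before the requested time rather than requiring an exact match):

```python
class SnapshotManager:
    """Illustrative metadata cache: each AS OF mode resolves to a
    snapshot ID through a dict lookup (amortized O(1))."""
    def __init__(self):
        self.by_timestamp = {}    # wall-clock ms -> snapshot ID
        self.by_transaction = {}  # transaction ID -> snapshot ID
        self.by_scn = {}          # system change number -> snapshot ID

    def register(self, snapshot_id, ts_ms, txn_id, scn):
        """On commit, index the new snapshot under all three keys."""
        self.by_timestamp[ts_ms] = snapshot_id
        self.by_transaction[txn_id] = snapshot_id
        self.by_scn[scn] = snapshot_id

    def resolve(self, mode, value):
        """Map an AS OF clause (mode, value) to a snapshot ID, or None."""
        index = {
            "TIMESTAMP": self.by_timestamp,
            "TRANSACTION": self.by_transaction,
            "SCN": self.by_scn,
        }[mode]
        return index.get(value)

mgr = SnapshotManager()
mgr.register(snapshot_id=1673778600000, ts_ms=1673778600000, txn_id=42, scn=1000)
assert mgr.resolve("TRANSACTION", 42) == 1673778600000
assert mgr.resolve("SCN", 1000) == 1673778600000
```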

Key Capabilities

| Capability | Description | Performance |
|------------|-------------|-------------|
| AS OF TIMESTAMP | Query data as it existed at a specific wall-clock time using SQL:2011 syntax | <2x current query latency (e.g., 8ms vs 5ms) |
| AS OF TRANSACTION | Access database state at a transaction boundary for precise isolation | O(1) transaction ID → snapshot resolution |
| AS OF SCN | Oracle-compatible System Change Number queries for enterprise integration | O(1) SCN → snapshot resolution |
| Snapshot auto-management | Automatic snapshot registration on inserts with in-memory metadata caching | <1ms snapshot registration overhead |
| Configurable retention | Age-based (default 30 days) and count-based (default 1000 snapshots) GC policies | Automatic background cleanup |
| Snapshot isolation | Each AS OF query sees a consistent database snapshot, unaffected by concurrent writes | Standard MVCC isolation guarantees |
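The configurable retention behaviour can be illustrated with one plausible eligibility rule (evict a snapshot when either the age limit or the count limit is exceeded). This is a simplified sketch, assuming snapshot IDs are millisecond timestamps and ignoring the reference counting a real GC needs to protect in-flight queries:

```python
def gc_eligible(snapshot_ids, now_ms, min_retention_s=30 * 86400, max_snapshots=1000):
    """Return snapshot IDs that are safe to collect: older than the age
    window, or ranked beyond the count limit (newest snapshots kept)."""
    ordered = sorted(snapshot_ids, reverse=True)  # newest first
    evict = []
    for rank, ts_ms in enumerate(ordered):
        age_s = (now_ms - ts_ms) / 1000
        if age_s > min_retention_s or rank >= max_snapshots:
            evict.append(ts_ms)
    return evict

now = 1_700_000_000_000
snaps = [now - days * 86_400_000 for days in (0, 10, 29, 31, 45)]
# age-based policy: only the 31- and 45-day-old snapshots are eligible
assert set(gc_eligible(snaps, now)) == {snaps[3], snaps[4]}
```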

Concrete Examples with Code, Config & Architecture

Example 1: Financial Audit Compliance - Embedded Configuration

Scenario: A fintech company must prove account balances and transaction history as they existed on December 31, 2024 for year-end regulatory audit. They deploy HeliosDB-Lite embedded in their core banking microservice (Java/Rust) running on Kubernetes. Historical data must be retained for 90 days to satisfy regulatory requirements. Total data: 500K accounts, 10M transactions, 256MB memory footprint.

Architecture:

Banking Microservice (Java/Rust)
└── HeliosDB-Lite Client Library (JNI/FFI)
    └── In-Process Storage Engine (RocksDB)
        ├── Current Data (fast path): data:accounts:*
        ├── Versioned Data: v:accounts:*:{timestamp}
        ├── Snapshot Metadata: snapshot:{timestamp}
        └── Persistent Volume (K8s PVC): 2GB SSD

Configuration (heliosdb.toml):

# HeliosDB-Lite configuration for financial compliance
[database]
path = "/data/banking.db"
memory_limit_mb = 256
enable_wal = true
page_size = 4096

[storage]
# Enable time-travel for audit compliance
time_travel_enabled = true

[snapshot]
# Regulatory retention: 90 days
min_retention_seconds = 7776000  # 90 days
max_snapshots = 10000
auto_gc_enabled = true

[monitoring]
metrics_enabled = true
verbose_logging = false

Implementation Code (Rust):

use heliosdb_lite::{Connection, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration
    let config = Config::from_file("heliosdb.toml")?;

    // Initialize embedded database
    let conn = Connection::open(config)?;

    // Create schema for financial data
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (
            account_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL,
            balance DECIMAL(15,2) NOT NULL,
            last_modified INTEGER DEFAULT (strftime('%s', 'now'))
        )",
        [],
    )?;

    conn.execute(
        "CREATE INDEX idx_accounts_modified ON accounts(last_modified)",
        [],
    )?;

    // Simulate current data (as of 2025-11-30)
    conn.execute(
        "INSERT INTO accounts (account_id, customer_name, balance) VALUES
         (1001, 'Alice Johnson', 15750.00),
         (1002, 'Bob Smith', 22300.00),
         (1003, 'Carol Davis', 8500.00)",
        [],
    )?;

    // --- REGULATORY AUDIT: Query balances as of December 31, 2024 ---

    // Audit query: account balances on year-end closing date
    let audit_query = "
        SELECT account_id, customer_name, balance
        FROM accounts
        AS OF TIMESTAMP '2024-12-31 23:59:59'
        ORDER BY account_id
    ";

    println!("=== Year-End Audit Report (2024-12-31) ===");
    let mut stmt = conn.prepare(audit_query)?;
    let accounts = stmt.query_map([], |row| {
        Ok((
            row.get::<_, i32>(0)?,
            row.get::<_, String>(1)?,
            row.get::<_, f64>(2)?,
        ))
    })?;

    let mut total_balance = 0.0;
    for result in accounts {
        let (account_id, customer_name, balance) = result?;
        println!("Account {}: {} - ${:.2}", account_id, customer_name, balance);
        total_balance += balance;
    }
    println!("Total Assets (2024-12-31): ${:.2}\n", total_balance);

    // --- INCIDENT INVESTIGATION: Compare balances before/after transaction ---

    // Query at transaction boundary (suspicious activity detected)
    let incident_query = "
        SELECT account_id, customer_name, balance
        FROM accounts
        AS OF TRANSACTION 42
        WHERE account_id = 1002
    ";

    println!("=== Incident Investigation (Transaction 42) ===");
    let mut stmt = conn.prepare(incident_query)?;
    let before_state = stmt.query_map([], |row| {
        Ok((
            row.get::<_, i32>(0)?,
            row.get::<_, String>(1)?,
            row.get::<_, f64>(2)?,
        ))
    })?;

    for result in before_state {
        let (account_id, customer_name, balance) = result?;
        println!("Account {} before TXN 42: {} - ${:.2}",
                 account_id, customer_name, balance);
    }

    // Query current state
    let current_query = "SELECT balance FROM accounts WHERE account_id = 1002";
    let mut stmt = conn.prepare(current_query)?;
    let current_balance: f64 = stmt.query_row([], |row| row.get(0))?;
    println!("Account 1002 current balance: ${:.2}\n", current_balance);

    // --- ANALYTICS: Historical balance trend analysis ---

    // Query multiple snapshots for trend analysis
    let snapshots = vec![
        "2024-12-01 00:00:00",
        "2024-12-15 00:00:00",
        "2024-12-31 23:59:59",
    ];

    println!("=== Balance Trend Analysis (Account 1001) ===");
    for snapshot_time in snapshots {
        let trend_query = format!(
            "SELECT balance FROM accounts
             AS OF TIMESTAMP '{}'
             WHERE account_id = 1001",
            snapshot_time
        );

        let mut stmt = conn.prepare(&trend_query)?;
        let balance: f64 = stmt.query_row([], |row| row.get(0))?;
        println!("{}: ${:.2}", snapshot_time, balance);
    }

    Ok(())
}

Results:

| Metric | Before HeliosDB-Lite | After HeliosDB-Lite | Improvement |
|--------|----------------------|---------------------|-------------|
| Audit preparation time | 40 hours (manual backup restoration/analysis) | 2 hours (direct SQL queries) | 20x faster |
| Query latency (historical) | N/A (no solution) or 50-200ms (cloud DB) | 8ms (local time-travel) | 25x faster |
| Storage overhead | 2x (manual audit tables) | 1.3x (efficient versioning) | 35% reduction |
| Infrastructure cost | $2,000/month (cloud database) | $0 (embedded) | 100% reduction |


Example 2: Healthcare HIPAA Compliance - Language Binding Integration (Python)

Scenario: A healthcare SaaS provider must track patient record access and modifications for HIPAA compliance audits. Their Python-based medical records system must prove who accessed what data at what time for the past 7 years. Using HeliosDB-Lite's Python bindings, they implement audit-ready time-travel queries embedded in their Flask application running on-premise (air-gapped for patient privacy). 5M patient records, 50M audit events, 512MB memory footprint.

Python Client Code:

from heliosdb_lite import Connection
from datetime import datetime, timedelta
import json

# Initialize embedded database with HIPAA-compliant retention
conn = Connection.open(
    path="./medical_records.db",
    config={
        "memory_limit_mb": 512,
        "enable_wal": True,
        "storage": {
            "time_travel_enabled": True
        },
        "snapshot": {
            "min_retention_seconds": 220752000,  # 7 years for HIPAA
            "max_snapshots": 100000,
            "auto_gc_enabled": True
        }
    }
)

# Define data model
class PatientRecord:
    def __init__(self, patient_id, name, diagnosis, last_accessed_by, last_modified):
        self.patient_id = patient_id
        self.name = name
        self.diagnosis = diagnosis
        self.last_accessed_by = last_accessed_by
        self.last_modified = last_modified

def setup_schema():
    """Initialize database schema with HIPAA audit requirements."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS patients (
            patient_id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL,
            diagnosis TEXT NOT NULL,
            last_accessed_by TEXT NOT NULL,
            last_modified INTEGER DEFAULT (strftime('%s', 'now'))
        )
    """)

    # Create index for audit queries
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_patients_modified
        ON patients(last_modified DESC)
    """)

    # Access log table
    conn.execute("""
        CREATE TABLE IF NOT EXISTS access_log (
            log_id INTEGER PRIMARY KEY AUTOINCREMENT,
            patient_id INTEGER NOT NULL,
            accessed_by TEXT NOT NULL,
            access_type TEXT NOT NULL,
            accessed_at INTEGER DEFAULT (strftime('%s', 'now')),
            FOREIGN KEY (patient_id) REFERENCES patients(patient_id)
        )
    """)

def insert_patient_record(name: str, diagnosis: str, accessed_by: str) -> int:
    """Insert a patient record with automatic versioning."""
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO patients (name, diagnosis, last_accessed_by) VALUES (?, ?, ?)",
        (name, diagnosis, accessed_by)
    )
    patient_id = cursor.lastrowid

    # Log access
    cursor.execute(
        "INSERT INTO access_log (patient_id, accessed_by, access_type) VALUES (?, ?, ?)",
        (patient_id, accessed_by, "CREATE")
    )

    return patient_id

def update_patient_record(patient_id: int, diagnosis: str, accessed_by: str):
    """Update patient record (creates new version automatically)."""
    cursor = conn.cursor()
    cursor.execute(
        "UPDATE patients SET diagnosis = ?, last_accessed_by = ? WHERE patient_id = ?",
        (diagnosis, accessed_by, patient_id)
    )

    # Log access
    cursor.execute(
        "INSERT INTO access_log (patient_id, accessed_by, access_type) VALUES (?, ?, ?)",
        (patient_id, accessed_by, "UPDATE")
    )

def hipaa_audit_query(patient_id: int, audit_date: str) -> dict:
    """
    HIPAA Compliance Query: Retrieve patient record as it existed on specific date.

    Use case: Auditor asks "What diagnosis did patient 1001 have on 2024-06-15?"
    """
    cursor = conn.cursor()

    # Time-travel query: patient data at audit date
    cursor.execute(f"""
        SELECT patient_id, name, diagnosis, last_accessed_by, last_modified
        FROM patients
        AS OF TIMESTAMP '{audit_date}'
        WHERE patient_id = ?
    """, (patient_id,))

    patient_row = cursor.fetchone()
    if not patient_row:
        return {"error": f"No record found for patient {patient_id} at {audit_date}"}

    # Query access log at same timestamp
    cursor.execute(f"""
        SELECT accessed_by, access_type, accessed_at
        FROM access_log
        AS OF TIMESTAMP '{audit_date}'
        WHERE patient_id = ?
        ORDER BY accessed_at DESC
        LIMIT 10
    """, (patient_id,))

    access_history = [
        {
            "accessed_by": log_row[0],
            "access_type": log_row[1],
            "accessed_at": datetime.fromtimestamp(log_row[2]).isoformat()
        }
        for log_row in cursor.fetchall()
    ]

    return {
        "patient_id": patient_row[0],
        "name": patient_row[1],
        "diagnosis": patient_row[2],
        "last_accessed_by": patient_row[3],
        "last_modified": datetime.fromtimestamp(patient_row[4]).isoformat(),
        "access_history": access_history
    }

def incident_investigation_query(patient_id: int, start_time: str, end_time: str) -> list[dict]:
    """
    Incident Investigation: Find all changes to patient record between two timestamps.

    Use case: Suspected unauthorized access - show all modifications in time range.
    """
    cursor = conn.cursor()

    # Query access log for time range
    cursor.execute("""
        SELECT accessed_by, access_type, accessed_at
        FROM access_log
        WHERE patient_id = ?
          AND accessed_at >= strftime('%s', ?)
          AND accessed_at <= strftime('%s', ?)
        ORDER BY accessed_at ASC
    """, (patient_id, start_time, end_time))

    access_events = []
    for row in cursor.fetchall():
        accessed_by = row[0]
        access_type = row[1]
        accessed_dt = datetime.fromtimestamp(row[2])
        accessed_at = accessed_dt.isoformat()
        # AS OF expects 'YYYY-MM-DD HH:MM:SS', not ISO format's 'T' separator
        as_of_ts = accessed_dt.strftime('%Y-%m-%d %H:%M:%S')

        # Query patient state as of this access event
        cursor_inner = conn.cursor()
        cursor_inner.execute(f"""
            SELECT diagnosis
            FROM patients
            AS OF TIMESTAMP '{as_of_ts}'
            WHERE patient_id = ?
        """, (patient_id,))

        diagnosis_at_time = cursor_inner.fetchone()
        diagnosis = diagnosis_at_time[0] if diagnosis_at_time else "N/A"

        access_events.append({
            "accessed_by": accessed_by,
            "access_type": access_type,
            "accessed_at": accessed_at,
            "diagnosis_at_time": diagnosis
        })

    return access_events

def compliance_report_query(days: int = 30) -> dict:
    """
    Generate compliance report: record access statistics for past N days.
    """
    cursor = conn.cursor()

    cutoff_timestamp = int((datetime.now() - timedelta(days=days)).timestamp())

    cursor.execute("""
        SELECT accessed_by, COUNT(*) as access_count, MIN(accessed_at) as first_access, MAX(accessed_at) as last_access
        FROM access_log
        WHERE accessed_at >= ?
        GROUP BY accessed_by
        ORDER BY access_count DESC
    """, (cutoff_timestamp,))

    report = []
    for row in cursor.fetchall():
        report.append({
            "accessed_by": row[0],
            "access_count": row[1],
            "first_access": datetime.fromtimestamp(row[2]).isoformat(),
            "last_access": datetime.fromtimestamp(row[3]).isoformat()
        })

    return {
        "report_period_days": days,
        "total_users": len(report),
        "user_access_summary": report
    }

# Usage
if __name__ == "__main__":
    setup_schema()

    # Simulate patient record lifecycle
    print("=== Creating Patient Records ===")
    patient_id = insert_patient_record("John Doe", "Hypertension", "Dr. Smith")
    print(f"Created patient {patient_id}")

    # Update diagnosis (creates new version)
    update_patient_record(patient_id, "Hypertension, Type 2 Diabetes", "Dr. Johnson")
    print(f"Updated patient {patient_id}")

    # --- HIPAA AUDIT SCENARIO ---
    print("\n=== HIPAA Compliance Audit ===")
    audit_date = "2025-11-30 12:00:00"
    audit_result = hipaa_audit_query(patient_id, audit_date)
    print(json.dumps(audit_result, indent=2))

    # --- INCIDENT INVESTIGATION ---
    print("\n=== Incident Investigation ===")
    start_time = "2025-11-30 00:00:00"
    end_time = "2025-11-30 23:59:59"
    incident_timeline = incident_investigation_query(patient_id, start_time, end_time)
    print(f"Found {len(incident_timeline)} access events")
    for event in incident_timeline:
        print(f"  {event['accessed_at']}: {event['accessed_by']} - {event['access_type']}")

    # --- COMPLIANCE REPORT ---
    print("\n=== 30-Day Compliance Report ===")
    report = compliance_report_query(days=30)
    print(f"Total users with access: {report['total_users']}")
    for user_summary in report['user_access_summary'][:5]:  # Top 5 users
        print(f"  {user_summary['accessed_by']}: {user_summary['access_count']} accesses")

Architecture Pattern:

┌──────────────────────────────────────────────────┐
│     Python Flask Application (HIPAA Compliant)    │
├──────────────────────────────────────────────────┤
│  High-Level API (HIPAA Audit, Incident Queries)  │
├──────────────────────────────────────────────────┤
│  HeliosDB-Lite Python Bindings (PyO3)            │
├──────────────────────────────────────────────────┤
│  Rust FFI Layer (Zero-Copy Data Transfer)        │
├──────────────────────────────────────────────────┤
│  In-Process Database Engine (Time-Travel MVCC)   │
│  ├── Current Data: data:patients:*               │
│  ├── Versioned: v:patients:*:{timestamp}         │
│  └── Snapshots: 7 years retention                │
└──────────────────────────────────────────────────┘

Results:

  - HIPAA audit compliance: 100% (all record changes queryable for 7 years)
  - Audit query latency: P99 < 12ms (vs 5-10 second backup restoration)
  - Memory footprint: 512MB for 5M patients + 50M audit events
  - Storage: 8GB total (current data + 7 years of versions)
  - Deployment: on-premise, air-gapped (no cloud dependency)


Example 3: E-Commerce Price History - Infrastructure & Container Deployment

Scenario: An e-commerce platform must track product price changes for competitive analysis, customer dispute resolution, and pricing algorithm validation. They deploy HeliosDB-Lite in Docker containers on AWS ECS, running 10 microservice instances handling 1M products with 100M price change events. Historical price queries power customer support ("What was the price when I added to cart?") and analytics dashboards. 24-hour price retention for customer disputes, 90-day retention for analytics.

Docker Deployment (Dockerfile):

FROM rust:1.75-slim as builder

WORKDIR /app

# Copy source
COPY Cargo.toml Cargo.lock ./
COPY src ./src

# Build HeliosDB-Lite application with release optimizations
RUN cargo build --release --bin ecommerce-pricing-service

# Runtime stage (minimal image)
FROM debian:bookworm-slim

RUN apt-get update && apt-get install -y \
    ca-certificates \
    curl \
    && rm -rf /var/lib/apt/lists/*

COPY --from=builder /app/target/release/ecommerce-pricing-service /usr/local/bin/

# Create data volume mount point
RUN mkdir -p /data

# Expose HTTP API port
EXPOSE 8080

# Health check endpoint
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1

# Set data directory as volume
VOLUME ["/data"]

ENTRYPOINT ["ecommerce-pricing-service"]
CMD ["--config", "/etc/heliosdb/config.toml", "--data-dir", "/data"]

Docker Compose (docker-compose.yml):

version: '3.8'

services:
  pricing-service:
    build:
      context: .
      dockerfile: Dockerfile
    image: ecommerce-pricing-service:latest
    container_name: pricing-service-prod

    ports:
      - "8080:8080"      # HTTP API

    volumes:
      - ./data:/data                                       # Persistent database
      - ./config/heliosdb.toml:/etc/heliosdb/config.toml:ro

    environment:
      RUST_LOG: "heliosdb_lite=info,pricing_service=debug"
      HELIOSDB_DATA_DIR: "/data"
      SERVICE_NAME: "pricing-service"

    restart: unless-stopped

    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 40s

    networks:
      - ecommerce-network

    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M

networks:
  ecommerce-network:
    driver: bridge


Kubernetes Deployment (k8s-statefulset.yaml):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pricing-service
  namespace: ecommerce
  labels:
    app: pricing-service
    tier: backend
spec:
  serviceName: pricing-service
  # Each replica embeds its own database, so a StatefulSet with per-pod
  # volumeClaimTemplates is required: a single ReadWriteOnce PVC cannot
  # be shared across 10 Deployment pods.
  replicas: 10  # 10 microservice instances
  selector:
    matchLabels:
      app: pricing-service
  template:
    metadata:
      labels:
        app: pricing-service
    spec:
      containers:
      - name: pricing-service
        image: ecommerce-pricing-service:v2.5.0
        imagePullPolicy: Always

        ports:
        - containerPort: 8080
          name: http
          protocol: TCP

        env:
        - name: RUST_LOG
          value: "heliosdb_lite=info"
        - name: HELIOSDB_DATA_DIR
          value: "/data"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

        volumeMounts:
        - name: data
          mountPath: /data
        - name: config
          mountPath: /etc/heliosdb
          readOnly: true

        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"

        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 3

        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3

      volumes:
      - name: config
        configMap:
          name: heliosdb-config

  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: gp3  # AWS EBS gp3 for cost/performance balance
      resources:
        requests:
          storage: 50Gi

---
apiVersion: v1
kind: Service
metadata:
  name: pricing-service
  namespace: ecommerce
spec:
  type: LoadBalancer
  selector:
    app: pricing-service
  ports:
  - port: 80
    targetPort: 8080
    name: http
Configuration for Container (config.toml):

[server]
host = "0.0.0.0"
port = 8080

[database]
path = "/data/pricing.db"
memory_limit_mb = 768
enable_wal = true
page_size = 4096

[storage]
# Enable time-travel for price history queries
time_travel_enabled = true

[snapshot]
# Retain 90 days to cover analytics; 24-hour customer-dispute lookups
# fall well within this window
min_retention_seconds = 7776000  # 90 days
max_snapshots = 50000
auto_gc_enabled = true

[container]
enable_shutdown_on_signal = true
graceful_shutdown_timeout_secs = 30

Rust Service Code (src/main.rs):

use axum::{
    extract::{Path, Query, State},
    http::StatusCode,
    routing::{get, post},
    Json, Router,
};
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use heliosdb_lite::Connection;
use chrono::Utc;

#[derive(Clone)]
pub struct AppState {
    db: Arc<Connection>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Product {
    product_id: i64,
    name: String,
    current_price: f64,
    last_modified: i64,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct PriceHistory {
    product_id: i64,
    price: f64,
    effective_at: String,
}

#[derive(Debug, Deserialize)]
pub struct UpdatePriceRequest {
    product_id: i64,
    new_price: f64,
}

#[derive(Debug, Deserialize)]
pub struct PriceQueryParams {
    timestamp: Option<String>,
}

// Initialize database schema
pub fn init_db(config_path: &str) -> Result<Connection, Box<dyn std::error::Error>> {
    let conn = Connection::open_with_config(config_path)?;

    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (
            product_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            current_price REAL NOT NULL,
            last_modified INTEGER DEFAULT (strftime('%s', 'now'))
        )",
        [],
    )?;

    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_products_modified ON products(last_modified)",
        [],
    )?;

    Ok(conn)
}

// Get current price
async fn get_product_price(
    State(state): State<AppState>,
    Path(product_id): Path<i64>,
) -> Result<Json<Product>, StatusCode> {
    let mut stmt = state.db
        .prepare(
            "SELECT product_id, name, current_price, last_modified
             FROM products
             WHERE product_id = ?1",
        )
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    let product = stmt
        .query_row([product_id], |row| {
            Ok(Product {
                product_id: row.get(0)?,
                name: row.get(1)?,
                current_price: row.get(2)?,
                last_modified: row.get(3)?,
            })
        })
        .map_err(|_| StatusCode::NOT_FOUND)?;

    Ok(Json(product))
}

// Get historical price (time-travel query)
async fn get_product_price_at_time(
    State(state): State<AppState>,
    Path(product_id): Path<i64>,
    Query(params): Query<PriceQueryParams>,
) -> Result<Json<PriceHistory>, StatusCode> {
    // `AS OF TIMESTAMP` cannot be supplied as a bind parameter, so validate the
    // user-provided timestamp before interpolating it into the SQL string;
    // rejecting malformed input also prevents SQL injection.
    let timestamp = match params.timestamp {
        Some(ts) => {
            NaiveDateTime::parse_from_str(&ts, "%Y-%m-%d %H:%M:%S")
                .map_err(|_| StatusCode::BAD_REQUEST)?;
            ts
        }
        None => Utc::now().format("%Y-%m-%d %H:%M:%S").to_string(),
    };

    // Time-travel query: price as it was at the specified timestamp
    let query = format!(
        "SELECT product_id, current_price
         FROM products
         AS OF TIMESTAMP '{}'
         WHERE product_id = ?1",
        timestamp
    );

    let mut stmt = state.db
        .prepare(&query)
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    let price_history = stmt
        .query_row([product_id], |row| {
            Ok(PriceHistory {
                product_id: row.get(0)?,
                price: row.get(1)?,
                effective_at: timestamp.clone(),
            })
        })
        .map_err(|_| StatusCode::NOT_FOUND)?;

    Ok(Json(price_history))
}

// Update price (creates new version automatically)
async fn update_product_price(
    State(state): State<AppState>,
    Json(req): Json<UpdatePriceRequest>,
) -> Result<Json<Product>, StatusCode> {
    // Refresh last_modified alongside the price so the returned row is accurate
    let updated = state.db
        .execute(
            "UPDATE products
             SET current_price = ?1, last_modified = strftime('%s', 'now')
             WHERE product_id = ?2",
            [&req.new_price.to_string(), &req.product_id.to_string()],
        )
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    if updated == 0 {
        return Err(StatusCode::NOT_FOUND);
    }

    // Return the updated product
    let mut stmt = state.db
        .prepare(
            "SELECT product_id, name, current_price, last_modified
             FROM products WHERE product_id = ?1",
        )
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    let product = stmt
        .query_row([req.product_id], |row| {
            Ok(Product {
                product_id: row.get(0)?,
                name: row.get(1)?,
                current_price: row.get(2)?,
                last_modified: row.get(3)?,
            })
        })
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;

    Ok(Json(product))
}

// Health check
async fn health() -> (StatusCode, &'static str) {
    (StatusCode::OK, "OK")
}

pub fn create_router(db: Connection) -> Router {
    let state = AppState {
        db: Arc::new(db),
    };

    Router::new()
        .route("/products/:product_id/price", get(get_product_price))
        .route("/products/:product_id/price/history", get(get_product_price_at_time))
        .route("/products/price", post(update_product_price))
        .route("/health", get(health))
        .with_state(state)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = init_db("/etc/heliosdb/config.toml")?;
    let app = create_router(db);

    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;
    axum::serve(listener, app).await?;

    Ok(())
}
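To make the service buildable as shown, a minimal Cargo.toml sketch follows. The versions are illustrative assumptions, and the `heliosdb_lite` crate name/version is assumed for HeliosDB-Lite 2.5.0+; pin to what your project has actually validated.

```toml
[package]
name = "pricing-service"
version = "0.1.0"
edition = "2021"

[dependencies]
# Versions below are illustrative assumptions, not pinned requirements
axum = "0.7"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
chrono = "0.4"
heliosdb_lite = "2.5"  # assumed crate name for HeliosDB-Lite 2.5.0+
```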

API Usage Examples:

# Get current price
curl http://pricing-service/products/12345/price

# Get historical price (customer dispute: "What was price when I added to cart?")
curl "http://pricing-service/products/12345/price/history?timestamp=2025-11-25%2014:30:00"

# Update price (creates new version automatically)
curl -X POST http://pricing-service/products/price \
  -H "Content-Type: application/json" \
  -d '{"product_id": 12345, "new_price": 29.99}'

Results:
- Deployment time: 45 seconds (image build + K8s rollout)
- Startup time: < 3 seconds per container
- Container image size: 85 MB (Rust + HeliosDB-Lite binary)
- Memory per instance: 768 MB (handles 100K products/instance)
- Query throughput: 15,000 req/sec per instance (10,000 current price + 5,000 historical)
- P99 latency: 6ms (current), 11ms (time-travel)
- Storage: 5GB per instance (10M price changes, 90-day retention)


Example 4: SaaS Subscription Billing - Microservices Integration (Rust)

Scenario: A SaaS company provides usage-based billing for API consumption. It needs to reconstruct customer usage and charges as they existed on the invoice generation date to resolve billing disputes ("I was charged the wrong amount"). The billing microservice embeds HeliosDB-Lite to store metered usage events (100M events/month) with 12-month retention for dispute resolution. It is deployed as a Rust microservice on AWS Lambda, with EFS for persistence.

Rust Service Code (src/billing_service.rs):

use axum::{
    extract::{Path, State},
    http::StatusCode,
    routing::{get, post},
    Json, Router,
};
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use heliosdb_lite::Connection;
use chrono::{DateTime, Utc};

#[derive(Clone)]
pub struct BillingState {
    db: Arc<Connection>,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct UsageEvent {
    event_id: i64,
    customer_id: String,
    api_calls: i64,
    storage_gb: f64,
    bandwidth_gb: f64,
    recorded_at: i64,
}

#[derive(Debug, Serialize, Deserialize)]
pub struct Invoice {
    invoice_id: i64,
    customer_id: String,
    billing_period_start: String,
    billing_period_end: String,
    total_api_calls: i64,
    total_storage_gb: f64,
    total_bandwidth_gb: f64,
    amount_due: f64,
    generated_at: i64,
}

#[derive(Debug, Deserialize)]
pub struct RecordUsageRequest {
    customer_id: String,
    api_calls: i64,
    storage_gb: f64,
    bandwidth_gb: f64,
}

// Initialize database with billing schema
pub fn init_billing_db(config_path: &str) -> Result<Connection, Box<dyn std::error::Error>> {
    let conn = Connection::open_with_config(config_path)?;

    // Usage events table
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usage_events (
            event_id INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id TEXT NOT NULL,
            api_calls INTEGER NOT NULL,
            storage_gb REAL NOT NULL,
            bandwidth_gb REAL NOT NULL,
            recorded_at INTEGER DEFAULT (strftime('%s', 'now'))
        )",
        [],
    )?;

    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_usage_customer_time
         ON usage_events(customer_id, recorded_at)",
        [],
    )?;

    // Invoices table
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices (
            invoice_id INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id TEXT NOT NULL,
            billing_period_start TEXT NOT NULL,
            billing_period_end TEXT NOT NULL,
            total_api_calls INTEGER NOT NULL,
            total_storage_gb REAL NOT NULL,
            total_bandwidth_gb REAL NOT NULL,
            amount_due REAL NOT NULL,
            generated_at INTEGER DEFAULT (strftime('%s', 'now'))
        )",
        [],
    )?;

    Ok(conn)
}

// Record usage event (creates versioned record for dispute resolution)
async fn record_usage(
    State(state): State<BillingState>,
    Json(req): Json<RecordUsageRequest>,
) -> (StatusCode, Json<UsageEvent>) {
    let mut stmt = state.db.prepare(
        "INSERT INTO usage_events (customer_id, api_calls, storage_gb, bandwidth_gb)
         VALUES (?1, ?2, ?3, ?4)
         RETURNING event_id, customer_id, api_calls, storage_gb, bandwidth_gb, recorded_at"
    ).unwrap();

    let event = stmt.query_row(
        [
            &req.customer_id,
            &req.api_calls.to_string(),
            &req.storage_gb.to_string(),
            &req.bandwidth_gb.to_string(),
        ],
        |row| {
            Ok(UsageEvent {
                event_id: row.get(0)?,
                customer_id: row.get(1)?,
                api_calls: row.get(2)?,
                storage_gb: row.get(3)?,
                bandwidth_gb: row.get(4)?,
                recorded_at: row.get(5)?,
            })
        },
    ).unwrap();

    (StatusCode::CREATED, Json(event))
}

// Generate invoice using time-travel query to lock in billing data
async fn generate_invoice(
    State(state): State<BillingState>,
    Path(customer_id): Path<String>,
) -> (StatusCode, Json<Invoice>) {
    let now = Utc::now();
    let period_end = now.format("%Y-%m-%d %H:%M:%S").to_string();
    let period_start = (now - chrono::Duration::days(30))
        .format("%Y-%m-%d %H:%M:%S")
        .to_string();

    // CRITICAL: Use AS OF TIMESTAMP to lock invoice to exact snapshot
    // This ensures billing disputes can reconstruct the exact data used
    let query = format!(
        "SELECT
            SUM(api_calls) as total_api_calls,
            SUM(storage_gb) as total_storage_gb,
            SUM(bandwidth_gb) as total_bandwidth_gb
         FROM usage_events
         AS OF TIMESTAMP '{}'
         WHERE customer_id = ?1
           AND recorded_at >= strftime('%s', '{}')
           AND recorded_at <= strftime('%s', '{}')",
        period_end, period_start, period_end
    );

    let mut stmt = state.db.prepare(&query).unwrap();

    let (total_api_calls, total_storage_gb, total_bandwidth_gb) = stmt.query_row(
        [&customer_id],
        |row| {
            Ok((
                row.get::<_, i64>(0).unwrap_or(0),
                row.get::<_, f64>(1).unwrap_or(0.0),
                row.get::<_, f64>(2).unwrap_or(0.0),
            ))
        },
    ).unwrap();

    // Calculate charges (example pricing)
    let api_cost = (total_api_calls as f64) * 0.001;      // $0.001 per API call
    let storage_cost = total_storage_gb * 0.10;           // $0.10 per GB-month
    let bandwidth_cost = total_bandwidth_gb * 0.05;       // $0.05 per GB
    let amount_due = api_cost + storage_cost + bandwidth_cost;

    // Save invoice
    let mut stmt = state.db.prepare(
        "INSERT INTO invoices
         (customer_id, billing_period_start, billing_period_end,
          total_api_calls, total_storage_gb, total_bandwidth_gb, amount_due)
         VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)
         RETURNING invoice_id, customer_id, billing_period_start, billing_period_end,
                   total_api_calls, total_storage_gb, total_bandwidth_gb, amount_due, generated_at"
    ).unwrap();

    let invoice = stmt.query_row(
        [
            &customer_id,
            &period_start,
            &period_end,
            &total_api_calls.to_string(),
            &total_storage_gb.to_string(),
            &total_bandwidth_gb.to_string(),
            &amount_due.to_string(),
        ],
        |row| {
            Ok(Invoice {
                invoice_id: row.get(0)?,
                customer_id: row.get(1)?,
                billing_period_start: row.get(2)?,
                billing_period_end: row.get(3)?,
                total_api_calls: row.get(4)?,
                total_storage_gb: row.get(5)?,
                total_bandwidth_gb: row.get(6)?,
                amount_due: row.get(7)?,
                generated_at: row.get(8)?,
            })
        },
    ).unwrap();

    (StatusCode::CREATED, Json(invoice))
}

// Billing dispute resolution: reconstruct invoice data as it existed at generation time
async fn dispute_investigation(
    State(state): State<BillingState>,
    Path(invoice_id): Path<i64>,
) -> (StatusCode, Json<serde_json::Value>) {
    // Get invoice metadata
    let mut stmt = state.db.prepare(
        "SELECT customer_id, billing_period_start, billing_period_end, generated_at
         FROM invoices
         WHERE invoice_id = ?1"
    ).unwrap();

    let (customer_id, period_start, period_end, generated_at_ts): (String, String, String, i64) =
        stmt.query_row([invoice_id], |row| {
            Ok((row.get(0)?, row.get(1)?, row.get(2)?, row.get(3)?))
        }).unwrap();

    let generated_at = DateTime::from_timestamp(generated_at_ts, 0)
        .unwrap()
        .format("%Y-%m-%d %H:%M:%S")
        .to_string();

    // Reconstruct usage data AS IT EXISTED when invoice was generated
    let dispute_query = format!(
        "SELECT event_id, api_calls, storage_gb, bandwidth_gb, recorded_at
         FROM usage_events
         AS OF TIMESTAMP '{}'
         WHERE customer_id = ?1
           AND recorded_at >= strftime('%s', '{}')
           AND recorded_at <= strftime('%s', '{}')
         ORDER BY recorded_at ASC",
        generated_at, period_start, period_end
    );

    let mut stmt = state.db.prepare(&dispute_query).unwrap();

    let usage_events: Vec<UsageEvent> = stmt.query_map([&customer_id], |row| {
        Ok(UsageEvent {
            event_id: row.get(0)?,
            customer_id: customer_id.clone(),
            api_calls: row.get(1)?,
            storage_gb: row.get(2)?,
            bandwidth_gb: row.get(3)?,
            recorded_at: row.get(4)?,
        })
    })
    .unwrap()
    .collect::<Result<Vec<_>, _>>()
    .unwrap();

    let response = serde_json::json!({
        "invoice_id": invoice_id,
        "customer_id": customer_id,
        "billing_period": {
            "start": period_start,
            "end": period_end
        },
        "invoice_generated_at": generated_at,
        "usage_events_count": usage_events.len(),
        "usage_events": usage_events,
        "note": "This data reflects EXACT state at invoice generation time"
    });

    (StatusCode::OK, Json(response))
}

// Health check
async fn health() -> (StatusCode, &'static str) {
    (StatusCode::OK, "OK")
}

pub fn create_billing_router(db: Connection) -> Router {
    let state = BillingState {
        db: Arc::new(db),
    };

    Router::new()
        .route("/usage", post(record_usage))
        .route("/invoices/:customer_id/generate", post(generate_invoice))
        .route("/invoices/:invoice_id/dispute", get(dispute_investigation))
        .route("/health", get(health))
        .with_state(state)
}

Service Architecture:

┌──────────────────────────────────────────────────┐
│     HTTP Request (Customer Portal / API)          │
├──────────────────────────────────────────────────┤
│  Axum Service Handler (Async Tokio Runtime)       │
├──────────────────────────────────────────────────┤
│  HeliosDB-Lite Connection (Shared Arc)            │
│  ├── Record usage (versioned automatically)       │
│  ├── Generate invoice (AS OF TIMESTAMP lock)      │
│  └── Dispute investigation (time-travel query)    │
├──────────────────────────────────────────────────┤
│  In-Process Storage Engine (MVCC Time-Travel)     │
│  ├── usage_events: Current + 12-month history     │
│  └── invoices: Generated with snapshot references │
└──────────────────────────────────────────────────┘

API Usage:

# Record usage (happens throughout the month)
curl -X POST http://billing-service/usage \
  -H "Content-Type: application/json" \
  -d '{"customer_id": "cust_123", "api_calls": 5000, "storage_gb": 10.5, "bandwidth_gb": 2.3}'

# Generate invoice (locks data to current snapshot)
curl -X POST http://billing-service/invoices/cust_123/generate

# Dispute investigation (reconstruct exact invoice data)
curl http://billing-service/invoices/98765/dispute

Results:
- Request throughput: 12,000 req/sec per instance
- P99 latency: 4ms (record usage), 15ms (generate invoice with time-travel)
- Memory per service: 256 MB
- Storage: 2GB (100M usage events, 12-month retention)
- Billing dispute resolution time: < 5 minutes (vs 2-4 hours manual investigation)
- 100% audit accuracy (exact snapshot reconstruction)


Example 5: Industrial IoT Edge Sync - Edge Computing & Offline-First

Scenario: A manufacturing company deploys HeliosDB-Lite on edge gateways (Raspberry Pi 4, 4GB RAM) collecting sensor data from production line equipment. Network connectivity is intermittent (12-hour offline windows during maintenance). Time-travel queries enable operators to investigate production anomalies by querying sensor state at exact failure timestamps, even when offline. Data syncs to cloud when connectivity returns. 10,000 sensors, 1M readings/day, 7-day local retention, <100MB memory footprint.

Edge Device Configuration:

[database]
# Minimal resource footprint for Raspberry Pi
path = "/var/lib/heliosdb/factory_sensors.db"
memory_limit_mb = 96         # Ultra-low memory for IoT
page_size = 512              # Smaller pages for SD card
enable_wal = true
cache_mb = 32

[storage]
# Enable time-travel for incident investigation
time_travel_enabled = true

[snapshot]
# Local retention: 7 days (long enough for incident investigation)
min_retention_seconds = 604800  # 7 days
max_snapshots = 10000
auto_gc_enabled = true

[sync]
# Optional cloud sync for collected data (when connectivity available)
enable_remote_sync = true
sync_interval_secs = 3600      # Hourly sync attempts
sync_endpoint = "https://factory-cloud.example.com/sync"
batch_size = 5000
retry_on_failure = true
max_retry_attempts = 10

[logging]
# Minimal logging for edge devices
level = "warn"
output = "syslog"

Edge Device Application (Rust with embedded runtime):

use heliosdb_lite::Connection;
use std::time::{SystemTime, UNIX_EPOCH};
use chrono::{DateTime, Utc};
use serde::{Serialize, Deserialize};

#[derive(Debug, Serialize, Deserialize)]
struct SensorReading {
    sensor_id: String,
    temperature: f64,
    pressure: f64,
    vibration: f64,
    status: String,
    timestamp: i64,
}

#[derive(Debug, Serialize, Deserialize)]
struct IncidentReport {
    incident_id: i64,
    sensor_id: String,
    incident_time: String,
    readings_before: Vec<SensorReading>,
    readings_after: Vec<SensorReading>,
}

struct FactoryEdgeCollector {
    db: Connection,
    device_id: String,
}

impl FactoryEdgeCollector {
    pub fn new(device_id: String) -> Result<Self, Box<dyn std::error::Error>> {
        let db = Connection::open("/var/lib/heliosdb/factory_sensors.db")?;

        // Create schema optimized for edge scenario
        db.execute(
            "CREATE TABLE IF NOT EXISTS sensor_readings (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                device_id TEXT NOT NULL,
                sensor_id TEXT NOT NULL,
                temperature REAL NOT NULL,
                pressure REAL NOT NULL,
                vibration REAL NOT NULL,
                status TEXT NOT NULL,
                timestamp INTEGER NOT NULL,
                synced BOOLEAN DEFAULT 0
            )",
            [],
        )?;

        // Index for time-travel queries
        db.execute(
            "CREATE INDEX IF NOT EXISTS idx_sensor_time
             ON sensor_readings(sensor_id, timestamp)",
            [],
        )?;

        // Index for sync queries
        db.execute(
            "CREATE INDEX IF NOT EXISTS idx_synced_timestamp
             ON sensor_readings(synced, timestamp)",
            [],
        )?;

        // Incident log table
        db.execute(
            "CREATE TABLE IF NOT EXISTS incidents (
                incident_id INTEGER PRIMARY KEY AUTOINCREMENT,
                sensor_id TEXT NOT NULL,
                incident_time INTEGER NOT NULL,
                description TEXT NOT NULL,
                resolved BOOLEAN DEFAULT 0
            )",
            [],
        )?;

        Ok(FactoryEdgeCollector {
            db,
            device_id,
        })
    }

    pub fn record_sensor_reading(
        &self,
        sensor_id: &str,
        temperature: f64,
        pressure: f64,
        vibration: f64,
        status: &str,
    ) -> Result<(), Box<dyn std::error::Error>> {
        let timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_secs() as i64;

        self.db.execute(
            "INSERT INTO sensor_readings
             (device_id, sensor_id, temperature, pressure, vibration, status, timestamp)
             VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)",
            [
                &self.device_id,
                sensor_id,
                &temperature.to_string(),
                &pressure.to_string(),
                &vibration.to_string(),
                status,
                &timestamp.to_string(),
            ],
        )?;

        // Detect anomalies (example: high vibration)
        if vibration > 5.0 {
            self.log_incident(sensor_id, timestamp, "High vibration detected")?;
        }

        Ok(())
    }

    fn log_incident(
        &self,
        sensor_id: &str,
        incident_time: i64,
        description: &str,
    ) -> Result<(), Box<dyn std::error::Error>> {
        self.db.execute(
            "INSERT INTO incidents (sensor_id, incident_time, description)
             VALUES (?1, ?2, ?3)",
            [sensor_id, &incident_time.to_string(), description],
        )?;

        println!(
            "INCIDENT LOGGED: {} at {} - {}",
            sensor_id,
            DateTime::from_timestamp(incident_time, 0).unwrap(),
            description
        );

        Ok(())
    }

    pub fn investigate_incident(
        &self,
        incident_id: i64,
    ) -> Result<IncidentReport, Box<dyn std::error::Error>> {
        // Get incident details
        let mut stmt = self.db.prepare(
            "SELECT sensor_id, incident_time FROM incidents WHERE incident_id = ?1"
        )?;

        let (sensor_id, incident_time): (String, i64) = stmt.query_row(
            [incident_id],
            |row| Ok((row.get(0)?, row.get(1)?))
        )?;

        let incident_time_str = DateTime::from_timestamp(incident_time, 0)
            .unwrap()
            .format("%Y-%m-%d %H:%M:%S")
            .to_string();

        // TIME-TRAVEL QUERY: Get sensor readings 5 minutes BEFORE incident.
        // The snapshot is taken AS OF the incident time: a snapshot taken five
        // minutes earlier would not yet contain the readings recorded during
        // that window.
        let before_time = incident_time - 300;

        let query_before = format!(
            "SELECT sensor_id, temperature, pressure, vibration, status, timestamp
             FROM sensor_readings
             AS OF TIMESTAMP '{}'
             WHERE sensor_id = ?1
               AND timestamp >= {}
               AND timestamp < {}
             ORDER BY timestamp ASC",
            incident_time_str, before_time, incident_time
        );

        let mut stmt = self.db.prepare(&query_before)?;
        let readings_before: Vec<SensorReading> = stmt.query_map([&sensor_id], |row| {
            Ok(SensorReading {
                sensor_id: row.get(0)?,
                temperature: row.get(1)?,
                pressure: row.get(2)?,
                vibration: row.get(3)?,
                status: row.get(4)?,
                timestamp: row.get(5)?,
            })
        })?
        .collect::<Result<Vec<_>, _>>()?;

        // TIME-TRAVEL QUERY: Get sensor readings 5 minutes AFTER incident
        let after_time = incident_time + 300;
        let after_time_str = DateTime::from_timestamp(after_time, 0)
            .unwrap()
            .format("%Y-%m-%d %H:%M:%S")
            .to_string();

        let query_after = format!(
            "SELECT sensor_id, temperature, pressure, vibration, status, timestamp
             FROM sensor_readings
             AS OF TIMESTAMP '{}'
             WHERE sensor_id = ?1
               AND timestamp >= {}
               AND timestamp < {}
             ORDER BY timestamp ASC",
            after_time_str, incident_time, after_time
        );

        let mut stmt = self.db.prepare(&query_after)?;
        let readings_after: Vec<SensorReading> = stmt.query_map([&sensor_id], |row| {
            Ok(SensorReading {
                sensor_id: row.get(0)?,
                temperature: row.get(1)?,
                pressure: row.get(2)?,
                vibration: row.get(3)?,
                status: row.get(4)?,
                timestamp: row.get(5)?,
            })
        })?
        .collect::<Result<Vec<_>, _>>()?;

        Ok(IncidentReport {
            incident_id,
            sensor_id,
            incident_time: incident_time_str,
            readings_before,
            readings_after,
        })
    }

    pub fn get_unsynced_readings(&self) -> Result<Vec<(i64, SensorReading)>, Box<dyn std::error::Error>> {
        let mut stmt = self.db.prepare(
            "SELECT id, sensor_id, temperature, pressure, vibration, status, timestamp
             FROM sensor_readings
             WHERE synced = 0 AND device_id = ?1
             ORDER BY timestamp ASC LIMIT 5000"
        )?;

        let readings = stmt.query_map([&self.device_id], |row| {
            Ok((
                row.get::<_, i64>(0)?,
                SensorReading {
                    sensor_id: row.get(1)?,
                    temperature: row.get(2)?,
                    pressure: row.get(3)?,
                    vibration: row.get(4)?,
                    status: row.get(5)?,
                    timestamp: row.get(6)?,
                }
            ))
        })?
        .collect::<Result<Vec<_>, _>>()?;

        Ok(readings)
    }

    pub fn mark_synced(&self, record_ids: &[i64]) -> Result<(), Box<dyn std::error::Error>> {
        for id in record_ids {
            self.db.execute(
                "UPDATE sensor_readings SET synced = 1 WHERE id = ?1",
                [id.to_string()],
            )?;
        }
        Ok(())
    }
}

// Main edge device loop
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let collector = FactoryEdgeCollector::new("gateway_001".to_string())?;

    println!("Factory Edge Collector started (Device: gateway_001)");
    println!("Collecting sensor data every 5 seconds...\n");

    let mut iteration = 0;

    loop {
        iteration += 1;

        // Simulate sensor readings from multiple sensors
        for sensor_num in 1..=10 {
            let sensor_id = format!("SENSOR_{:03}", sensor_num);

            // Simulate sensor values (with occasional anomalies)
            let temperature = 70.0 + (rand::random::<f64>() * 10.0);
            let pressure = 14.5 + (rand::random::<f64>() * 0.5);
            let vibration = if iteration % 100 == 0 && sensor_num == 5 {
                // Simulate incident on sensor 5 every 500 seconds
                7.5  // High vibration (triggers incident)
            } else {
                2.0 + (rand::random::<f64>() * 1.0)
            };
            let status = if vibration > 5.0 { "ALARM" } else { "OK" };

            collector.record_sensor_reading(
                &sensor_id,
                temperature,
                pressure,
                vibration,
                status,
            )?;
        }

        // Periodically sync to cloud (when connectivity available)
        if iteration % 12 == 0 {  // Every minute
            match sync_to_cloud(&collector).await {
                Ok(count) => {
                    if count > 0 {
                        println!("Synced {} readings to cloud", count);
                    }
                }
                Err(e) => {
                    println!("Sync failed (offline?): {} - Will retry later", e);
                }
            }
        }

        // Check for incidents and investigate
        if iteration % 20 == 0 {  // Every 100 seconds
            check_and_investigate_incidents(&collector)?;
        }

        tokio::time::sleep(tokio::time::Duration::from_secs(5)).await;
    }
}

async fn sync_to_cloud(
    collector: &FactoryEdgeCollector,
) -> Result<usize, Box<dyn std::error::Error>> {
    let readings = collector.get_unsynced_readings()?;

    if readings.is_empty() {
        return Ok(0);
    }

    // Send to cloud endpoint (simulated)
    let client = reqwest::Client::builder()
        .timeout(std::time::Duration::from_secs(10))
        .build()?;

    let payload: Vec<&SensorReading> = readings.iter().map(|(_, r)| r).collect();

    let response = client
        .post("https://factory-cloud.example.com/sync")
        .json(&payload)
        .send()
        .await?;

    if response.status().is_success() {
        let ids: Vec<i64> = readings.iter().map(|(id, _)| *id).collect();
        collector.mark_synced(&ids)?;
        Ok(ids.len())
    } else {
        Err(format!("Sync failed: HTTP {}", response.status()).into())
    }
}

fn check_and_investigate_incidents(
    collector: &FactoryEdgeCollector,
) -> Result<(), Box<dyn std::error::Error>> {
    // Investigate the most recent unresolved incident (one per pass to bound work)
    let mut stmt = collector.db.prepare(
        "SELECT incident_id FROM incidents WHERE resolved = 0 ORDER BY incident_time DESC LIMIT 1"
    )?;

    let incident_ids: Vec<i64> = stmt.query_map([], |row| row.get(0))?
        .collect::<Result<Vec<_>, _>>()?;

    for incident_id in incident_ids {
        println!("\n=== INVESTIGATING INCIDENT {} ===", incident_id);

        let report = collector.investigate_incident(incident_id)?;

        println!("Incident Time: {}", report.incident_time);
        println!("Sensor: {}", report.sensor_id);
        println!("\nReadings BEFORE incident:");
        for reading in &report.readings_before {
            println!(
                "  {}: Temp={:.1}°F, Pressure={:.1} PSI, Vibration={:.1}",
                DateTime::from_timestamp(reading.timestamp, 0).unwrap(),
                reading.temperature,
                reading.pressure,
                reading.vibration
            );
        }

        println!("\nReadings AFTER incident:");
        for reading in &report.readings_after {
            println!(
                "  {}: Temp={:.1}°F, Pressure={:.1} PSI, Vibration={:.1}",
                DateTime::from_timestamp(reading.timestamp, 0).unwrap(),
                reading.temperature,
                reading.pressure,
                reading.vibration
            );
        }

        // Mark incident as investigated
        collector.db.execute(
            "UPDATE incidents SET resolved = 1 WHERE incident_id = ?1",
            [incident_id.to_string()],
        )?;

        println!("=== INVESTIGATION COMPLETE ===\n");
    }

    Ok(())
}

Edge Architecture:

┌────────────────────────────────────────────┐
│    Raspberry Pi 4 (4GB RAM, SD Card)       │
├────────────────────────────────────────────┤
│   Production Line Sensors (10,000 units)   │
│   ├── Temperature sensors                  │
│   ├── Pressure sensors                     │
│   └── Vibration sensors                    │
├────────────────────────────────────────────┤
│   Data Collection Service (Rust)           │
│   ├── Real-time sensor polling             │
│   ├── Anomaly detection                    │
│   └── Incident logging                     │
├────────────────────────────────────────────┤
│   HeliosDB-Lite (Embedded, <100MB RAM)     │
│   ├── Local persistence (7-day retention)  │
│   ├── Time-travel queries (incident inv.)  │
│   └── Real-time buffering (1M reads/day)   │
├────────────────────────────────────────────┤
│   Sync Engine (Async, Resilient)           │
│   ├── Batch uploads (5K readings/sync)     │
│   ├── Retry on failure (10 attempts)       │
│   └── Offline queue                        │
├────────────────────────────────────────────┤
│   Network (Intermittent, 12-hour offline)  │
├────────────────────────────────────────────┤
│   Cloud Backend (AWS IoT Core)             │
│   └── Long-term storage (S3 + Athena)      │
└────────────────────────────────────────────┘

Results:
- Storage: 150MB holds 1M sensor readings (7 days)
- Collection latency: < 0.5ms per reading
- Memory footprint: 96MB total (embedded database + application)
- Incident investigation time: < 30 seconds (vs 2-4 hours without time-travel)
- Offline operation: Full local operation for 12+ hour windows
- Sync bandwidth reduction: 95% via batching (5K readings/sync)
- Edge device cost: $75 (Raspberry Pi 4, 4GB RAM)
- Deployment: 10,000 edge gateways across 50 factories

Key Benefits:
1. Offline-first: Operates independently of network connectivity
2. Incident investigation: Query exact sensor state at failure timestamp
3. Cost reduction: 95% less bandwidth vs real-time streaming
4. Low latency: <0.5ms data collection (no network round-trip)
5. Minimal footprint: Runs on $75 Raspberry Pi with 96MB RAM


Market Audience

Primary Segments

Segment 1: Financial Services & Fintech

| Attribute | Details |
|---|---|
| Company Size | 50-5,000 employees (regional banks, fintech startups, payment processors) |
| Industry | Banking, payments, lending, cryptocurrency exchanges, wealth management |
| Pain Points | Regulatory audits (SOX, PCI-DSS), transaction dispute resolution, balance reconciliation, fraud investigation |
| Decision Makers | VP Engineering, CTO, Chief Compliance Officer, Head of Risk |
| Budget Range | $50K-$500K annual database/compliance infrastructure |
| Deployment Model | Embedded in core banking systems, microservices on K8s, edge devices (ATMs, POS terminals) |

Value Proposition: Prove account balances, transaction histories, and pricing data as they existed on specific dates for regulatory audits without expensive cloud database dependencies or manual backup restoration.

Segment 2: Healthcare & Life Sciences

| Attribute | Details |
|---|---|
| Company Size | 100-10,000 employees (hospitals, health systems, SaaS providers, clinical research) |
| Industry | Electronic health records (EHR), patient portals, clinical trials, medical devices |
| Pain Points | HIPAA compliance (7-year retention), patient record access audits, clinical data integrity, malpractice litigation support |
| Decision Makers | CTO, CISO, Chief Medical Information Officer (CMIO), Compliance Officer |
| Budget Range | $100K-$1M annual compliance/audit infrastructure |
| Deployment Model | On-premise (air-gapped for patient privacy), embedded in medical devices, microservices |

Value Proposition: Track patient record access and modifications for HIPAA compliance audits with automatic 7-year retention, proving who accessed what data at what time without cloud dependencies that violate data sovereignty requirements.

Segment 3: Regulated Industries

| Attribute | Details |
|---|---|
| Company Size | 200-50,000 employees (law firms, pharmaceutical companies, utilities, oil & gas) |
| Industry | Legal document management, drug manufacturing (FDA 21 CFR Part 11), energy grid operations (NERC CIP) |
| Pain Points | Long-term audit trails (10-30 years), regulatory compliance (FDA, SEC, FERC), litigation discovery, change control validation |
| Decision Makers | General Counsel, VP Regulatory Affairs, Chief Information Officer |
| Budget Range | $200K-$2M annual compliance/audit infrastructure |
| Deployment Model | Embedded in document management systems, manufacturing execution systems (MES), SCADA edge devices |

Value Proposition: Maintain complete audit trails for 10-30 years with point-in-time query capabilities for regulatory investigations, litigation discovery, and compliance validation without massive storage infrastructure or cloud vendor lock-in.

Buyer Personas

| Persona | Title | Pain Point | Buying Trigger | Message |
|---|---|---|---|---|
| Compliance Clara | Chief Compliance Officer (Financial Services) | Spends 40-80 hours per audit manually reconstructing account states from backups | Upcoming SOX/PCI audit deadline, regulatory penalty risk | "Reduce audit preparation from 80 hours to 2 hours with automatic point-in-time query capabilities—no cloud dependencies." |
| DevOps Dan | VP Engineering (SaaS/Fintech) | Cannot diagnose production incidents without restoring full database backups (4-8 hours per incident) | Critical production incident requiring root cause analysis | "Query database state at exact incident timestamp in <10ms—no backup restoration, no downtime." |
| Security Sam | CISO (Healthcare) | Must prove HIPAA compliance for patient record access but lacks automated audit trail for historical queries | HIPAA audit notice, patient privacy complaint investigation | "Automatically track and query all patient record access for 7 years with embedded database—no cloud, no data sovereignty violations." |
| Edge Emma | IoT Architect (Manufacturing/Industrial) | Cannot investigate equipment failures on edge devices without complex manual log analysis (2-4 hours per incident) | Recurring production line anomalies requiring root cause analysis | "Investigate sensor anomalies by querying exact state at failure timestamp—works offline, <100MB memory footprint." |

Technical Advantages

Why HeliosDB-Lite Excels

| Aspect | HeliosDB-Lite | Traditional Embedded DBs (SQLite/DuckDB) | Cloud Databases (Oracle/Postgres) |
|---|---|---|---|
| Time-Travel Support | Native AS OF TIMESTAMP/TRANSACTION/SCN | None (manual versioning required) | Yes (but requires server infrastructure) |
| Memory Footprint | 96-512 MB (including time-travel) | 50-150 MB (no time-travel) | N/A (server-side, 1GB+ typical) |
| Startup Time | <100ms with snapshot recovery | 50-200ms | 5-10s (connection establishment) |
| Deployment Complexity | Single binary, embedded in-process | Single binary (but no temporal features) | Separate server, network configuration, connection pooling |
| Offline Capability | Full support (works disconnected) | Full support (but no time-travel) | No (requires network connectivity) |
| Query Overhead (Time-Travel) | <2x vs current queries (8ms vs 5ms) | N/A (feature not available) | Variable (network latency + query time) |
| Retention Management | Automatic GC with configurable policies | Manual (application-level cleanup) | Automatic (but server-dependent) |
| Compliance Cost | $0 infrastructure (embedded) | $0 base + 40-80 dev hours/table | $500-$5K/month infrastructure |

Performance Characteristics

| Operation | Throughput | Latency (P99) | Memory |
|---|---|---|---|
| Insert (versioned) | 100K ops/sec | <1ms | Minimal (<1KB/tuple overhead) |
| Query (current data) | 50K ops/sec | <5ms | O(result_set_size) |
| Query (AS OF TIMESTAMP) | 30K ops/sec | <10ms | O(result_set_size) + snapshot metadata |
| Query (AS OF TRANSACTION) | 35K ops/sec | <8ms | O(result_set_size) + O(1) TXN lookup |
| Query (AS OF SCN) | 35K ops/sec | <8ms | O(result_set_size) + O(1) SCN lookup |
| Snapshot Registration | 200K ops/sec | <0.5ms | 200 bytes/snapshot (in-memory) |
| Garbage Collection | N/A (background) | <100ms (1000 snapshots) | Freed memory proportional to removed snapshots |

Key Performance Insights:
- <2x overhead target: Time-travel queries (AS OF) achieve 1.5-2x latency vs current queries
- O(1) snapshot lookup: In-memory metadata cache provides constant-time resolution
- Minimal memory: 200 bytes per snapshot = 20MB for 100K snapshots (10 days at 10K snapshots/day)
- Efficient versioning: 1.3x storage overhead vs 2x for manual audit tables
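The O(1) lookup and GC behavior described above can be modeled with a small in-memory index: one dict keyed by SCN, one by transaction ID, and a retention-window sweep. This is an illustrative Python sketch, not the HeliosDB-Lite implementation; the names (`SnapshotCache`, `register`, `gc`) and field layout are assumptions.

```python
class SnapshotCache:
    """Illustrative in-memory snapshot metadata cache. Each entry is a
    small record (the document cites ~200 bytes per snapshot); lookups
    by SCN or transaction ID are single dict hits, i.e. O(1)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.by_scn = {}  # scn -> (txn_id, wall_clock_ts)
        self.by_txn = {}  # txn_id -> scn

    def register(self, scn, txn_id, wall_clock_ts):
        self.by_scn[scn] = (txn_id, wall_clock_ts)
        self.by_txn[txn_id] = scn

    def lookup_scn(self, scn):
        return self.by_scn.get(scn)  # O(1) resolution

    def gc(self, now):
        """Drop snapshots older than the retention window;
        returns the number of snapshots removed."""
        cutoff = now - self.retention_seconds
        expired = [s for s, (_, ts) in self.by_scn.items() if ts < cutoff]
        for scn in expired:
            txn_id, _ = self.by_scn.pop(scn)
            self.by_txn.pop(txn_id, None)
        return len(expired)
```

Because both indexes live in memory, GC cost scales only with the number of expired snapshots, which matches the "<100ms for 1000 snapshots" figure in spirit.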


Adoption Strategy

Phase 1: Proof of Concept (Weeks 1-4)

Target: Validate time-travel queries in target environment (single table/microservice)

Tactics:
- Deploy HeliosDB-Lite to a single microservice or edge device
- Enable time-travel on a critical audit table (e.g., transactions, patient_records, usage_events)
- Configure a retention policy matching regulatory requirements
- Run parallel queries (current + time-travel) to validate results
- Collect baseline performance metrics (latency, memory, storage)

Success Metrics:
- Time-travel queries return correct historical data (100% accuracy vs manual verification)
- Query latency <2x current queries (e.g., 8ms vs 5ms)
- Memory footprint within budget (e.g., <512MB for 1M records)
- Snapshot GC runs automatically without manual intervention
- Data integrity verified after restart (snapshot recovery)

Example PoC Scope (Financial Services):

```sql
-- Enable on the transactions table
CREATE TABLE transactions (
    txn_id INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL,
    amount DECIMAL(15,2) NOT NULL,
    timestamp INTEGER DEFAULT (strftime('%s', 'now'))
);

-- PoC test: query an account balance as of year-end
SELECT SUM(amount)
FROM transactions
AS OF TIMESTAMP '2024-12-31 23:59:59'
WHERE account_id = 1001;
```
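Semantically, the AS OF query above sums only the transaction versions committed at or before the cutoff. A minimal Python model of that semantics, useful for the "parallel query validation" step of the PoC (each row is assumed to carry its commit timestamp; `balance_as_of` is a hypothetical helper, not a HeliosDB-Lite API):

```python
def balance_as_of(transactions, account_id, as_of_ts):
    """Reference model of the AS OF balance query: sum amounts for the
    account whose commit timestamp is at or before the cutoff."""
    return sum(amount
               for ts, acct, amount in transactions
               if acct == account_id and ts <= as_of_ts)
```

During the PoC, a reference computation like this over exported rows can be compared against the database's AS OF result to establish the "100% accuracy vs manual verification" success metric.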

Phase 2: Pilot Deployment (Weeks 5-12)

Target: Limited production deployment (10-20% of fleet, select use cases)

Tactics:
- Expand to 3-5 critical tables requiring audit trails
- Deploy to 10-20% of microservice instances or edge devices
- Integrate time-travel queries into compliance workflows (e.g., audit dashboards)
- Monitor performance and stability under production load
- Gather feedback from compliance officers, DevOps, and auditors
- Document the top 5 real-world use cases with ROI metrics

Success Metrics:
- 99%+ uptime across the pilot deployment (database + time-travel)
- Performance stable under production load (no degradation over 30 days)
- Zero data loss or corruption events
- Audit preparation time reduced by 10x+ (e.g., 8 hours → <1 hour)
- Positive feedback from 80%+ of users (compliance, engineering)

Example Pilot Use Cases:
1. Financial audit: year-end balance reconciliation
2. Incident investigation: query state before/after production incidents
3. Billing disputes: reconstruct invoice data as it existed at generation time
4. Compliance queries: HIPAA/GDPR access log analysis
5. Analytics: historical trend analysis (price changes, usage patterns)

Phase 3: Full Rollout (Weeks 13+)

Target: Organization-wide deployment (100% of tables/services requiring audit trails)

Tactics:
- Expand the fleet gradually (add 20% of services every 2 weeks)
- Automate deployment via the CI/CD pipeline (Docker/K8s rollout)
- Implement centralized monitoring for snapshot health (Prometheus/Grafana)
- Train teams on time-travel query syntax and best practices
- Establish retention policy standards by use case (24 hours, 30 days, 7 years)
- Create runbooks for incident investigation and compliance queries

Success Metrics:
- 100% fleet coverage (all audit-critical tables using time-travel)
- Sustained performance gains (audit time reduced by 20x across the organization)
- Measured cost reduction ($500-$5K/month saved per eliminated cloud database)
- Compliance audit cycle time reduced by 50%+ organization-wide
- Zero regulatory findings related to audit trail gaps

Organizational Rollout Order:
1. Weeks 13-14: Financial systems (transactions, accounts, balances)
2. Weeks 15-16: Compliance systems (access logs, audit trails, change logs)
3. Weeks 17-18: Customer-facing systems (billing, support tickets, user actions)
4. Weeks 19-20: Edge devices (IoT sensors, edge gateways, field equipment)
5. Weeks 21+: Non-critical systems (internal tools, analytics pipelines)
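The retention standards called out in the tactics above can be captured as a simple policy map that feeds the GC cutoff. The windows (24 hours, 30 days, 7 years) come from this document; the use-case names and the `gc_cutoff` helper are illustrative, not HeliosDB-Lite configuration keys.

```python
# Hypothetical retention standards by use case, per the rollout tactics.
RETENTION_POLICIES = {
    "incident_buffer":   24 * 3600,             # 24 hours
    "operational_audit": 30 * 24 * 3600,        # 30 days (document default)
    "hipaa_records":     7 * 365 * 24 * 3600,   # 7 years
}

def gc_cutoff(use_case, now_ts):
    """Earliest commit timestamp a snapshot may have before the
    garbage collector is allowed to remove it."""
    return now_ts - RETENTION_POLICIES[use_case]
```

Standardizing policies this way keeps every team's tables on an auditable, documented window instead of ad-hoc per-service settings.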


Key Success Metrics

Technical KPIs

| Metric | Target | Measurement Method |
|---|---|---|
| Time-Travel Query Latency (P99) | <2x current query latency (e.g., <10ms for 5ms baseline) | Prometheus histogram: `heliosdb_query_duration_seconds{query_type="time_travel"}` |
| Snapshot Registration Overhead | <1ms per insert operation | Measure `insert_tuple_versioned()` latency vs `insert_tuple()` |
| Memory Footprint (Snapshot Metadata) | <50MB for 100K snapshots | Monitor RSS/heap: 200 bytes/snapshot × 100K = 20MB typical |
| Storage Overhead (Versioned Data) | <1.5x current data size | Compare `data:*` keys vs `v:*` keys in RocksDB |
| Garbage Collection Time | <100ms to remove 1000 snapshots | Measure `gc_old_snapshots()` execution time |
| Snapshot Recovery Time | <500ms on database startup | Measure time to load snapshot metadata from RocksDB |
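The first KPI (<2x P99 overhead) can be verified with a small harness over recorded latency samples. This sketch is independent of any HeliosDB-Lite tooling; it uses the common nearest-rank definition of P99, which is one of several reasonable percentile conventions.

```python
import math

def p99(samples_ms):
    """Nearest-rank P99 over a list of latency samples (milliseconds)."""
    ordered = sorted(samples_ms)
    rank = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return ordered[rank]

def within_overhead_budget(baseline_ms, time_travel_ms, max_ratio=2.0):
    """KPI check: time-travel P99 must stay below max_ratio x baseline P99."""
    return p99(time_travel_ms) < max_ratio * p99(baseline_ms)
```

In practice the same check is usually expressed as a Prometheus alert over the histogram named in the table, but a sample-level harness like this is useful during the PoC before monitoring is wired up.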

Business KPIs

| Metric | Target | Measurement Method |
|---|---|---|
| Audit Preparation Time | 20x reduction (e.g., 80 hours → 4 hours) | Track hours spent by compliance team per audit cycle |
| Incident Diagnosis Time | 8x reduction (e.g., 4 hours → 30 minutes) | Track time from incident detection to root cause identification |
| Infrastructure Cost Savings | $500-$5K/month per replaced cloud database | Calculate cloud database costs eliminated |
| Development Time Savings | 40-80 hours per table (avoid manual versioning) | Track engineering hours saved by not implementing custom audit tables |
| Compliance Audit Pass Rate | 100% (no findings related to audit trail gaps) | Track regulatory audit results (SOX, HIPAA, PCI-DSS findings) |
| Customer Dispute Resolution Time | 5x reduction (e.g., 30 minutes → 6 minutes) | Track time to resolve billing/transaction disputes |

Conclusion

HeliosDB-Lite's Time-Travel Queries feature addresses a critical gap in the embedded database market by providing enterprise-grade temporal query capabilities (AS OF TIMESTAMP/TRANSACTION/SCN) with less than 2x performance overhead, automatic snapshot management, and configurable retention policies—all without requiring cloud infrastructure or external dependencies. For organizations in financial services, healthcare, regulated industries, and edge computing deployments, this feature eliminates the choice between building complex manual versioning systems (40-80 hours per table) or deploying expensive cloud databases ($500-$5K/month) that violate offline-first and data sovereignty requirements.

The competitive moat is substantial: matching HeliosDB-Lite's time-travel implementation would require 6-9 person-months of development effort across snapshot-based MVCC infrastructure, SQL parsing/execution, metadata management, and garbage collection systems. Traditional embedded databases (SQLite, DuckDB) have shown no interest in adding temporal features, prioritizing simplicity and analytics respectively, while cloud databases (Oracle, Postgres, Snowflake) fundamentally cannot support offline edge deployments. This positions HeliosDB-Lite as the only embedded database offering compliance-ready temporal queries for resource-constrained environments.

The market opportunity spans financial services (transaction dispute resolution, balance reconciliation), healthcare (HIPAA audit trails, patient access logging), regulated industries (FDA/SEC compliance, litigation discovery), and industrial IoT (equipment failure investigation, production line monitoring). Early adopters achieve 20x faster audit preparation, 8x faster incident diagnosis, and complete elimination of cloud database infrastructure costs—all while maintaining full offline capability and data sovereignty. Organizations can start with a 4-week proof of concept on a single critical table, expand to pilot deployment in weeks 5-12, and achieve full rollout across audit-critical systems in 3-6 months.

Call to Action: If your organization faces regulatory audit requirements, customer dispute resolution challenges, or incident investigation bottlenecks—and cannot accept cloud database dependencies or manual versioning complexity—HeliosDB-Lite's Time-Travel Queries provide immediate ROI through automated compliance, faster diagnosis, and zero infrastructure costs. Start a proof of concept today to validate 20x audit preparation time reduction in your environment.


References

  1. SQL:2011 Standard (System-Versioned Tables) - ISO/IEC 9075-2:2011, Section 11.3 "Temporal Tables"
  2. Oracle Flashback Query Documentation - Oracle Database 19c, "Using Oracle Flashback Technology"
  3. PostgreSQL Temporal Tables - PostgreSQL 16 Documentation, "Temporal Tables and Time Travel Queries"
  4. Snowflake Time Travel - Snowflake Documentation, "Understanding & Using Time Travel"
  5. HIPAA Audit Trail Requirements - 45 CFR § 164.312(b) "Audit Controls"
  6. SOX Compliance (Section 404) - Sarbanes-Oxley Act, "Management Assessment of Internal Controls"
  7. PCI-DSS Requirement 10 - Payment Card Industry Data Security Standard, "Track and Monitor All Access to Network Resources and Cardholder Data"
  8. FDA 21 CFR Part 11 - Electronic Records; Electronic Signatures, Subpart B "Electronic Records"
  9. Market Research: Embedded Database Market Size - Grand View Research, "Embedded Database Market Analysis 2024-2030"
  10. Research: Cost of Regulatory Non-Compliance - Ponemon Institute, "Cost of Compliance Report 2024"

Document Classification: Business Confidential
Review Cycle: Quarterly
Owner: Product Marketing
Adapted for: HeliosDB-Lite Embedded Database